I have a data frame like the one below:

import pandas as pd

Z = pd.DataFrame({'Product': ['Apple', 'Apple', 'Apple', 'Orange', 'Orange'], 'Selling Price': [1.1, 1.2, 1.3, 2.1, 2.2]})

There are thousands of unique products and hundreds of millions of selling prices. How can I efficiently report the average selling price for each unique product?

Result = pd.DataFrame({'Product': ['Apple', 'Orange'], 'Average Selling Price': [1.2, 2.15]})

The challenge is that the data is stored in hundreds of different .csv files (the file names are stored in the list files), and I cannot load all of it into my environment at once. So I would do something like:

for i in files:
    X = pd.read_csv(i)
    # add unique products to the data frame Z
    # add the sum of their selling prices to Z
    # add the number of times the product was sold

# for each unique product, divide the sum of selling prices by the number of times that product was sold

Thanks for any help you can provide!
1 Answer

當年話下
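import pandas as pd

# Build one small per-file summary (count and sum per product) at a time
partials = []
for i in files:
    X = pd.read_csv(i)
    X_agg = X.groupby('Product', as_index=False).agg({'Selling Price': ['count', 'sum']})
    X_agg.columns = ['Product', 'sale_count', 'selling_sum']
    partials.append(X_agg)

# Combine the per-file summaries, then divide the total sum by the total count
final_df = pd.concat(partials).groupby('Product', as_index=False).agg({'sale_count': 'sum', 'selling_sum': 'sum'})
final_df['Average Selling Price'] = final_df['selling_sum'] / final_df['sale_count']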
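If a single .csv file is itself too large to load comfortably, the same sum-and-count idea can be applied chunk by chunk, with the final division by the count done at the end. The following is only a sketch under assumptions: the function name `average_price_per_product` and the `chunksize` default are made up for illustration, and the two in-memory strings stand in for the real files.

```python
import io
import pandas as pd

def average_price_per_product(sources, chunksize=1_000_000):
    """Return one row per product with its average selling price.

    `sources` is any iterable of paths or file-like objects that
    pd.read_csv accepts. `chunksize` is an assumed tuning value.
    """
    partials = []
    for src in sources:
        # Read only the two needed columns, a bounded number of rows at a time
        reader = pd.read_csv(src, usecols=['Product', 'Selling Price'],
                             chunksize=chunksize)
        for chunk in reader:
            # Per-chunk sum and count, indexed by product
            partials.append(
                chunk.groupby('Product')['Selling Price'].agg(['sum', 'count'])
            )
    # Add up the partial sums and counts for each product
    totals = pd.concat(partials).groupby(level=0).sum()
    totals['Average Selling Price'] = totals['sum'] / totals['count']
    return totals[['Average Selling Price']].reset_index()

# Two small in-memory "files" standing in for the real .csv files
demo_files = [
    io.StringIO("Product,Selling Price\nApple,1.1\nApple,1.2\nOrange,2.1\n"),
    io.StringIO("Product,Selling Price\nApple,1.3\nOrange,2.2\n"),
]
result = average_price_per_product(demo_files)
print(result)
```

Averaging the per-file averages would give the wrong answer whenever files contain different numbers of rows per product, which is why sums and counts are carried separately until the last step. The list of partial summaries holds at most one row per product per chunk, so it should stay small compared with the raw data.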