I have a pandas DataFrame with a datetime column (TransactionDate), a CustomerID column, and a Sales column. I want to resample the data daily to get total sales per day, but separately for each CustomerID. I tried two approaches, and neither gave the expected result.

When I set only TransactionDate as the index, resample('D').sum() sums Sales, but it also sums the CustomerID column, so I lose the information about which CustomerID generated how much in sales.

When I set both TransactionDate and CustomerID as the index, I get this error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'MultiIndex'

How can I get a DataFrame of daily sales per CustomerID? The full code, including the data, is below:

    import pandas as pd
    import numpy as np
    import random

    random.seed(30)
    np.random.seed(30)

    # 500 invoices, one every 3 hours starting 1/1/2015
    InvoiceNo = range(10000, 10500)
    print('len(InvoiceNo)', len(InvoiceNo))
    start_date, end_date = '1/1/2015', '12/31/2019'
    date_rng = pd.date_range(start=start_date, periods=len(InvoiceNo), freq='3H')
    length_of_field = date_rng.shape[0]

    df = pd.DataFrame(date_rng, columns=['TransactionDate'])
    df['InvoiceNo'] = InvoiceNo
    df['Quantity'] = np.random.randint(18, 100, size=(len(date_rng)))

    # Random item per invoice, with a fixed price per item
    Items = ('ItemA', 'ItemB', 'ItemC', 'ItemD')
    group_1 = np.random.choice(Items, len(InvoiceNo), p=[0.3, 0.5, 0.15, 0.05])
    Price = (10.0, 20, 30, 40)
    dict_item_price = dict(zip(Items, Price))
    PriceList = [dict_item_price[i] for i in group_1]

    # Random customer per invoice
    CustomerID = (18750, 18751, 18752, 18753, 18754, 18756, 18757)
    group_2 = np.random.choice(CustomerID, len(InvoiceNo), p=[0.10, 0.25, 0.15, 0.05, 0.35, 0.05, 0.05])

    df['ItemCode'] = group_1
    df['Price'] = PriceList
    df['CustomerID'] = group_2
    # Note: the result is not assigned, so CustomerID stays an integer column
    df['CustomerID'].astype(str)
    df['Sales'] = df['Price'] * df['Quantity']
    print('\ndf:')
    print(df)
    print(df.dtypes)

    df1 = df[['CustomerID', 'Sales', 'TransactionDate']].copy().set_index(['TransactionDate'])
    print('\n df1 :')
    print(df1)

    total_sales = df['Sales'].sum()
    print('\ntotal sales :', total_sales)

    # First attempt: sums Sales per day, but also sums CustomerID
    daily_sales = df1.resample('D').sum()
    print('\n daily_sales :')
    print(daily_sales)

1 Answer

慕雪6442864
Contributed 1812 experience points · Received more than 5 likes
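You can skip resample and group by CustomerID together with the calendar day of TransactionDate, something like: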
df.groupby(['CustomerID', df['TransactionDate'].dt.normalize()])['Sales'].sum()
Or, using daily periods instead of midnight timestamps:
df.groupby(['CustomerID', df['TransactionDate'].dt.to_period('D')])['Sales'].sum()
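For comparison, here is a minimal sketch of two related approaches that are not part of the answer above: grouping with pd.Grouper, and calling resample inside a groupby. The small DataFrame is made-up illustration data, not the data from the question, and the variable names (daily, daily_wide, per_customer) are only placeholders.

```python
import pandas as pd

# Made-up sample data with the same three columns as in the question
df = pd.DataFrame({
    "TransactionDate": pd.to_datetime([
        "2015-01-01 00:00", "2015-01-01 03:00", "2015-01-01 06:00",
        "2015-01-02 00:00", "2015-01-02 09:00",
    ]),
    "CustomerID": [18750, 18751, 18750, 18751, 18751],
    "Sales": [100.0, 200.0, 50.0, 300.0, 25.0],
})

# Variant 1: group by customer and by daily bins of TransactionDate
daily = (
    df.groupby(["CustomerID", pd.Grouper(key="TransactionDate", freq="D")])["Sales"]
      .sum()
)

# Optional: one row per day, one column per customer, 0 where there were no sales
daily_wide = daily.unstack("CustomerID", fill_value=0.0)

# Variant 2: resample within each customer group; this needs a DatetimeIndex,
# so only TransactionDate is set as the index and CustomerID is the group key
per_customer = (
    df.set_index("TransactionDate")
      .groupby("CustomerID")["Sales"]
      .resample("D")
      .sum()
)

print(daily)
print(daily_wide)
print(per_customer)
```

Both variants return a Series indexed by (CustomerID, TransactionDate). Variant 2 also avoids the TypeError from the question, because resample is applied to each group while the index is still a plain DatetimeIndex.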