開心每一天1111
2021-09-28 13:44:20
I have two pandas DataFrames that I am trying to merge. The DataFrames are different sizes, so I only want to keep the students that are present in df1; some students appear in only one of df1 or df2.

df1 has the headers ['student', 'week1_count', 'week1_mean', ..., 'week11_count', 'week11_mean'] and is initialized with zeros in every cell except the 'student' column. df2 has the headers ['student', 'week', 'count', 'mean'] and is populated with the corresponding 'student'. 'week' is an integer between 1 and 11, and 'count' and 'mean' are the corresponding floats.

What I want to do is, for a given student in both df1 and df2 and for a given week, take the corresponding 'count' and 'mean' values and place them in the matching columns of df1. For example, a 'week' value of 1 means that the 'count' and 'mean' values in df2 go into 'week1_count' and 'week1_mean' in df1, respectively. So far I have been looping over range(11) and creating a subset DataFrame for each week, but I wonder whether there is a faster way.

I.e. df1:

    student  week1_count  week1_mean  week2_count  week2_mean  ...
    '0'      0            0           0            0           ...
    '2'      0            0           0            0           ...
    '3'      0            0           0            0           ...
    .
    .
    .
    '500'    0            0           0            0           ...
    '541'    0            0           0            0           ...
    '542'    0            0           0            0           ...

and df2:

    student  week  count  mean
    '0'      1     5      6.5
    '1'      1     3      7.0
    '2'      1     2      8.2
    '2'      2     10     15.1
    .
    .
    .
    '500'    2     12     4.3
    '540'    4     1      3.0
    '542'    1     4      1.2
    '542'    2     9      5.2

So the expected result, df_result:

    student  week1_count  week1_mean  week2_count  week2_mean  ...
    '0'      5            6.5         0            0           ...
    '2'      2            8.2         10           15.1        ...
    '7'      0            0           0            0           ...
    .
    .
    .
    '500'    0            0           12           4.3         ...
    '541'    0            0           0            0           ...
    '542'    4            1.2         9            5.2         ...

I have tried various pandas routines, none of which worked as expected, for example:

- merge: using a 'left' join, because I want to keep the rows of df1. I tried renaming the columns of df2 to match.
- join
- concat
- update: tried initializing all cells in df1 to np.nan instead of 0.0, then using df1.update(df2) (after renaming the columns in df2) to update all the NaN values with the expected values
- setting the values directly: i.e. something like df1[rows_in_both][['week1_count','week1_mean']] = df2[rows_in_both][['count','mean']], but that does not work either
1 Answer

慕后森
This is more of an update problem than a merge problem.
s = df2.pivot(index='student', columns='week', values=['count', 'mean'])  # pivot df2 into the same wide layout as df1
s.columns.map('week{0[1]}_{0[0]}'.format)  # preview the flattened column names
Out[645]:
Index(['week1_count', 'week2_count', 'week4_count', 'week1_mean', 'week2_mean',
       'week4_mean'],
      dtype='object')
s.columns = s.columns.map('week{0[1]}_{0[0]}'.format)  # replace the (value, week) MultiIndex with the flat names
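Then we do the update:
df1 = df1.set_index('student')  # update aligns on the index, so index both frames by student
df1.update(s)  # update modifies df1 in place and returns None, so do not assign its result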
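For reference, the steps above can be put together into one runnable sketch. The sample frames here are assumptions made up for illustration (a handful of students taken from the tables in the question, with df1 built as all-zero floats), and df_result is simply the name the question uses for the expected output. It is meant as a minimal illustration of the pivot-then-update idea, not a definitive implementation.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's tables.
students = ['0', '2', '500', '541', '542']
cols = [f'week{w}_{stat}' for w in range(1, 12) for stat in ('count', 'mean')]
df1 = pd.DataFrame(0.0, index=pd.Index(students, name='student'),
                   columns=cols).reset_index()

df2 = pd.DataFrame({
    'student': ['0', '1', '2', '2', '500', '540', '542', '542'],
    'week':    [1, 1, 1, 2, 2, 4, 1, 2],
    'count':   [5.0, 3.0, 2.0, 10.0, 12.0, 1.0, 4.0, 9.0],
    'mean':    [6.5, 7.0, 8.2, 15.1, 4.3, 3.0, 1.2, 5.2],
})

# Long -> wide: one row per student, columns keyed by (value, week).
s = df2.pivot(index='student', columns='week', values=['count', 'mean'])
# Flatten ('count', 1) into 'week1_count' so the names match df1.
s.columns = s.columns.map('week{0[1]}_{0[0]}'.format)

# update() aligns on index and columns, changes the frame in place,
# skips NaN values in s, and ignores students that are not in df1.
df_result = df1.set_index('student')
df_result.update(s)
df_result = df_result.reset_index()

print(df_result[['student', 'week1_count', 'week1_mean',
                 'week2_count', 'week2_mean']])
```

Because update only overwrites cells where s has a non-NaN value, the zeros that df1 was initialized with stay in place for weeks a student has no data for, and students that appear only in df2 (here '1' and '540') are not added.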