1 Answer

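You can do this as a two-stage process. First build a mapping Series that holds, for each cluster, the name with the largest amount among rows where tag == 1, then map it onto the cluster column: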
s = df.query('tag == 1')\
.sort_values('amount', ascending=False)\
.drop_duplicates('cluster')\
.set_index('cluster')['name']
df['highest_name'] = df['cluster'].map(s)
print(df)
   cluster  tag  amount     name highest_name
0        1    0     200  Michael          NaN
1        2    1    1200     John         John
2        2    1     900   Daniel         John
3        2    0    3000    David         John
4        2    0     600    Jonny         John
5        3    0     900  Denisse         Kely
6        3    1     900     Mike         Kely
7        3    1    3000     Kely         Kely
8        3    0    2000    Devon         Kely
If you would rather use groupby, here is one way:
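import numpy as np

def func(x):
    # Within one cluster: names of the tag == 1 rows, largest amount first
    names = x.query('tag == 1').sort_values('amount', ascending=False)['name']
    # Clusters with no tag == 1 rows get NaN
    return names.iloc[0] if not names.empty else np.nan

df['highest_name'] = df['cluster'].map(df.groupby('cluster').apply(func))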
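To try both approaches end to end, here is a minimal self-contained sketch. The DataFrame is reconstructed from the printed output above, so its exact construction is an assumption, as is the helper name top_tagged_name. Selecting the columns explicitly before apply is a small change from the snippet above, made only so the grouping column is not passed to the function.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (assumed, not from the question).
df = pd.DataFrame({
    "cluster": [1, 2, 2, 2, 2, 3, 3, 3, 3],
    "tag":     [0, 1, 1, 0, 0, 0, 1, 1, 0],
    "amount":  [200, 1200, 900, 3000, 600, 900, 900, 3000, 2000],
    "name":    ["Michael", "John", "Daniel", "David", "Jonny",
                "Denisse", "Mike", "Kely", "Devon"],
})

# Approach 1: build a cluster -> name Series, then map it.
s = (
    df.query("tag == 1")
      .sort_values("amount", ascending=False)
      .drop_duplicates("cluster")      # keeps the first (largest amount) row per cluster
      .set_index("cluster")["name"]
)
via_map = df["cluster"].map(s)

# Approach 2: groupby + apply with a per-cluster function.
def top_tagged_name(group):
    names = group.query("tag == 1").sort_values("amount", ascending=False)["name"]
    return names.iloc[0] if not names.empty else np.nan

per_cluster = df.groupby("cluster")[["tag", "amount", "name"]].apply(top_tagged_name)
via_groupby = df["cluster"].map(per_cluster)

df["highest_name"] = via_map
print(df)
```

Both approaches should give the same column. The mapping version avoids calling a Python function once per group, which tends to matter when there are many clusters.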