2 Answers

Contributed 1848 experience points, received 10+ likes
With named aggregation, agg takes a single source column for each output column, so in your case you can create a helper column before the groupby:
import numpy as np

# Helper column: file_size for main videos, 0 for everything else
df['New'] = np.where(df['is_main_video'], df['file_size'], 0)
summary_df = df.groupby(['provider', 'id']).agg(
    title=('title', 'first'),
    file_size=('New', 'sum')
).reset_index()
Update: the same thing as a single chained expression, which leaves df itself unchanged:
summary_df = df.assign(New=np.where(df['is_main_video'], df['file_size'], 0)).groupby(['provider', 'id']).agg(
    title=('title', 'first'),
    file_size=('New', 'sum')
).reset_index()
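To make the approach above easy to try, here is a minimal self-contained sketch. The sample data is hypothetical: the question's actual DataFrame is not shown here, so the "id" values in particular are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the "id" values are invented for this sketch.
df = pd.DataFrame({
    "provider": ["A", "A", "A", "B", "B"],
    "id": [1, 1, 1, 2, 2],
    "title": ["hello", "world", "pandas", "example", "here"],
    "is_main_video": [True, False, True, True, False],
    "file_size": [10, 12, 20, 19, 10],
})

summary_df = (
    # New holds file_size for main videos and 0 for all other rows
    df.assign(New=np.where(df["is_main_video"], df["file_size"], 0))
      .groupby(["provider", "id"])
      .agg(title=("title", "first"), file_size=("New", "sum"))
      .reset_index()
)
print(summary_df)
```

Because the non-main rows contribute 0 rather than a missing value, the summed column should stay an integer column.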

Contributed 1858 experience points, received 8+ likes
You can use Series.where to temporarily mask the file_size values where is_main_video is False, then run the groupby to sum what remains:
import pandas as pd

df = pd.DataFrame({
    "provider": ["A", "A", "A", "B", "B"],
    "title": ["hello", "world", "pandas", "example", "here"],
    "is_main_video": [True, False, True, True, False],
    "file_size": [10, 12, 20, 19, 10]
})
print(df)
  provider    title  is_main_video  file_size
0        A    hello           True         10
1        A    world          False         12
2        A   pandas           True         20
3        B  example           True         19
4        B     here          False         10
aggregated_df = (
    df.assign(file_size=df["file_size"].where(df["is_main_video"]))
      .groupby("provider", as_index=False)
      .agg(
          title=("title", "first"),
          file_size=("file_size", "sum"))
)
print(aggregated_df)
  provider    title  file_size
0        A    hello       30.0
1        B  example       19.0
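The sums print as 30.0 and 19.0 because Series.where fills the masked rows with NaN by default, which converts the column to float. If integer output is preferred, one option (a sketch, not part of the original answer) is to pass 0 as the replacement value so that no NaN is introduced:

```python
import pandas as pd

df = pd.DataFrame({
    "provider": ["A", "A", "A", "B", "B"],
    "title": ["hello", "world", "pandas", "example", "here"],
    "is_main_video": [True, False, True, True, False],
    "file_size": [10, 12, 20, 19, 10],
})

aggregated_df = (
    # The second argument to where() is the value used where the condition is False
    df.assign(file_size=df["file_size"].where(df["is_main_video"], 0))
      .groupby("provider", as_index=False)
      .agg(title=("title", "first"), file_size=("file_size", "sum"))
)
print(aggregated_df)
```

The totals are the same either way, since sum skips NaN by default; only the dtype of the result differs.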