2 Answers

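Contributor with 1827 experience points, over 9 likes
Consider a chained merge with reduce, using the suffixes argument of merge to handle the duplicate column names, and then dropping the intermediate child columns: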
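from functools import reduce
import pandas as pd

def proc_build(x, y):
    # Look up each row's most distant ancestor so far (x.parents) in y.child;
    # the left-hand copies of overlapping columns get a '_' suffix
    temp = (pd.merge(x, y, left_on='parents', right_on='child',
                     how='left', suffixes=['_', ''])
              .fillna('-'))
    return temp

# df is the two-column (child, parents) frame from the question
final_df = (reduce(proc_build, [df, df, df, df])
            .set_axis(['child', 'parents',
                       'child1', 'A',
                       'child2', 'B',
                       'child3', 'C'], axis='columns')
            .reindex(['child', 'parents'] + list('ABC'), axis='columns')
           )
print(final_df)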
# child parents A B C
# 0 Joe Steffani Dani Selma Kevin
# 1 Joe Steffani Dani John -
# 2 Anna Bob Selma Kevin Robert
# 3 Anna Steffani Dani Selma Kevin
# 4 Anna Steffani Dani John -
# 5 Steffani Dani Selma Kevin Robert
# 6 Steffani Dani John - -
# 7 Bob Selma Kevin Robert -
# 8 Rea Anna Bob Selma Kevin
# 9 Rea Anna Steffani Dani Selma
# 10 Rea Anna Steffani Dani John
# 11 Dani Selma Kevin Robert -
# 12 Dani John - - -
# 13 Selma Kevin Robert - -
# 14 John - - - -
# 15 Kevin Robert - - -
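To see why the set_axis and reindex calls are needed, it can help to look at a single merge step in isolation. This is a minimal sketch, assuming a small made-up frame with the same child and parents columns (the sample rows are hypothetical, not the question's data):

```python
import pandas as pd

# Hypothetical sample with the same two columns as the question's data
df = pd.DataFrame({
    'child':   ['Joe', 'Steffani', 'Dani'],
    'parents': ['Steffani', 'Dani', 'Selma'],
})

# One step of the chain: look up each row's parent in the 'child' column.
# Both frames have 'child' and 'parents', so the left copies get the '_' suffix.
step = pd.merge(df, df, left_on='parents', right_on='child',
                how='left', suffixes=('_', '')).fillna('-')

print(list(step.columns))
# ['child_', 'parents_', 'child', 'parents']
print(step)
```

The unsuffixed parents column on the right is the one the next merge joins on. The suffixed and intermediate child columns are the ones that set_axis renames and reindex then drops.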
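To extend by another column, for example D, add another df to the iterable argument of reduce, together with the additional list items in set_axis and reindex, specifically ['child4', 'D'] and list('ABCD'). While there are ways to make these items dynamic, reduce can get expensive, so it should be handled with some declarative emphasis.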
final_df = (reduce(proc_build, [df] * 5)
            .set_axis(['child', 'parents',
                       'child1', 'A',
                       'child2', 'B',
                       'child3', 'C',
                       'child4', 'D'], axis='columns')
            .reindex(['child', 'parents'] + list('ABCD'), axis='columns')
           )
print(final_df)
# child parents A B C D
# 0 Joe Steffani Dani Selma Kevin Robert
# 1 Joe Steffani Dani John - -
# 2 Anna Bob Selma Kevin Robert -
# 3 Anna Steffani Dani Selma Kevin Robert
# 4 Anna Steffani Dani John - -
# 5 Steffani Dani Selma Kevin Robert -
# 6 Steffani Dani John - - -
# 7 Bob Selma Kevin Robert - -
# 8 Rea Anna Bob Selma Kevin Robert
# 9 Rea Anna Steffani Dani Selma Kevin
# 10 Rea Anna Steffani Dani John -
# 11 Dani Selma Kevin Robert - -
# 12 Dani John - - - -
# 13 Selma Kevin Robert - - -
# 14 John - - - - -
# 15 Kevin Robert - - - -
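One possible way to make the number of ancestor columns dynamic is an explicit loop that renames the lookup frame before every merge. This is only a sketch under stated assumptions: the helper name expand_ancestors, the temporary _key column, and the sample rows are all hypothetical. It swaps the repeated '_' suffix for per-level renaming, which also avoids duplicate column names (recent pandas releases appear to be stricter about suffixes that produce duplicates).

```python
import pandas as pd

def expand_ancestors(df, labels):
    """Add one ancestor column per label by repeatedly left-joining df onto itself."""
    out = df.copy()
    last = 'parents'  # column holding the most distant ancestor found so far
    for label in labels:
        # Rename so the lookup frame never shares a column name with `out`
        lookup = df.rename(columns={'child': '_key', 'parents': label})
        out = (out.merge(lookup, left_on=last, right_on='_key', how='left')
                  .drop(columns='_key'))
        last = label
    return out.fillna('-')

# Hypothetical sample data with the same column names as the question
df = pd.DataFrame({
    'child':   ['Joe', 'Steffani', 'Dani', 'Dani'],
    'parents': ['Steffani', 'Dani', 'Selma', 'John'],
})

result = expand_ancestors(df, list('AB'))
print(result)
```

Adding a further level then only requires a longer labels list, for example list('ABCD'). Each level is still a full merge, so the cost grows with the depth.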

Contributor with 1829 experience points, over 7 likes
Here is my rough solution. You should optimize it.
Load all the dataframes.
Save the names of all the dataframes in a list:
import pandas as pd

list_data = [data1, data2]
list_df = []
for i, data in enumerate(list_data):
    # Create module-level variables df0, df1, ... and remember their names
    vars()[f'df{i}'] = pd.DataFrame(data)
    list_df.append(f'df{i}')
Then create 2 proxy variables:
df_family: this will be the output.
last_df: used to break out of the loop when every row of the parents column is '-' but there are still dataframes left in the list.
last_df = False
df_family = pd.DataFrame()
This part merges the dataframes together as needed. I also rename the columns to consecutive integers (0, 1, ..., n-1) so that you can rename them easily.
for df in list_df:
    if last_df:
        break
    # Flag the last useful frame: no row in it has a known parent
    if (eval(df)['parents'] == '-').all():
        last_df = True
    if df_family.empty:
        df_family = eval(df)
    else:
        # Join the last column so far to the first column of the next frame
        df_family = pd.merge(df_family, eval(df), how='left',
                             left_on=df_family.columns[-1],
                             right_on=eval(df).columns[0])
        df_family.drop(columns=[eval(df).columns[0]], inplace=True)
list_cols = list(range(df_family.shape[1]))
df_family.columns = list_cols
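The same loop can also be written without vars() and eval by keeping the DataFrame objects themselves in the list. This is a minimal sketch under that assumption. The function name merge_generations and the sample data are hypothetical, and each frame's columns are prefixed before merging so that identically named columns do not collide:

```python
import pandas as pd

def merge_generations(frames):
    """Chain left-merges: the last column so far joins the first column of the next frame."""
    family = None
    for level, frame in enumerate(frames):
        # Prefix column names with the level so merges never need suffixes
        current = frame.add_prefix(f'g{level}_')
        if family is None:
            family = current
        else:
            key = current.columns[0]
            family = (family.merge(current, how='left',
                                   left_on=family.columns[-1], right_on=key)
                            .drop(columns=key))
        # Stop once no row in this frame has a known parent to look up
        if (frame['parents'] == '-').all():
            break
    if family is None:
        return pd.DataFrame()
    # Rename to consecutive integers for easy renaming later
    family.columns = range(family.shape[1])
    return family

# Hypothetical sample data; each frame has 'child' and 'parents' columns
data1 = {'child': ['Joe', 'Anna'], 'parents': ['Steffani', 'Bob']}
data2 = {'child': ['Steffani', 'Bob'], 'parents': ['Dani', '-']}

family = merge_generations([pd.DataFrame(data1), pd.DataFrame(data2)])
print(family)
```

Holding the frames directly avoids evaluating strings as code and also works inside a function, where assigning through vars() is not reliable.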