3 Answers

Contributor: 1817 experience points, 14+ upvotes
You can do this efficiently with get_dummies:
dummies = (df['allies'].str.get_dummies(sep=', ')
           .reindex(df['country'].unique(), axis=1)
           .add_suffix('_ally'))
df.join(dummies)
     country                     allies  USA_ally  China_ally  Singapore_ally
0        USA  Turkey, UK, France, India         0           0               0
1      China            DPRK, Singapore         0           0               1
2  Singapore                 USA, China         1           1               0
where
dummies
   USA_ally  China_ally  Singapore_ally
0         0           0               0
1         0           0               1
2         1           1               0
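
For anyone who wants to run the approach above end to end, here is a minimal sketch. The DataFrame is reconstructed from the output shown in this answer, and the fill_value=0 argument is my own addition, not part of the original code: without it, reindex would produce NaN for any country that never appears in the allies column.

```python
import pandas as pd

# Sample data reconstructed from the output shown in the answer.
df = pd.DataFrame({
    "country": ["USA", "China", "Singapore"],
    "allies": ["Turkey, UK, France, India", "DPRK, Singapore", "USA, China"],
})

# One 0/1 indicator column per distinct ally name, splitting on ", ".
all_dummies = df["allies"].str.get_dummies(sep=", ")

# Keep only the countries listed in `country`, in that order.
# fill_value=0 (an addition to the original) avoids NaN columns for
# countries that never show up as an ally.
dummies = (all_dummies
           .reindex(columns=df["country"].unique(), fill_value=0)
           .add_suffix("_ally"))

result = df.join(dummies)
print(result)
```

Because get_dummies splits on the separator, each ally is matched as a whole name rather than as a substring.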

Contributor: 1813 experience points, 2+ upvotes
Let's try this: use Series.unique to find the distinct countries, then str.contains to check whether each one appears in allies.
for c in df.country.unique():
    df[f'{c}_Aally'] = df.allies.str.contains(c).astype(int)
df
Out[20]:
     country                     allies  USA_Aally  China_Aally  Singapore_Aally
0        USA  Turkey, UK, France, India          0            0                0
1      China            DPRK, Singapore          0            0                1
2  Singapore                 USA, China          1            1                0
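
One caveat worth knowing: str.contains does substring matching (and treats the pattern as a regular expression by default), so a country whose name is contained in another name would be flagged by mistake. The sketch below uses hypothetical data, not the data from the question, to show the difference between a substring check and an exact-name check done by splitting the string first. The `_substring` and `_exact` column names are made up for illustration.

```python
import pandas as pd

# Hypothetical data chosen to expose the pitfall: "Niger" is a substring of "Nigeria".
df = pd.DataFrame({
    "country": ["Niger", "Nigeria", "Chad"],
    "allies": ["Chad", "Chad", "Nigeria"],
})

# Split each allies string into a list of exact names.
ally_lists = df["allies"].str.split(", ")

for c in df["country"].unique():
    # Literal substring test (regex=False disables regex interpretation).
    df[f"{c}_substring"] = df["allies"].str.contains(c, regex=False).astype(int)
    # Exact-name test against the split list.
    df[f"{c}_exact"] = ally_lists.map(lambda names: int(c in names))

print(df)
```

With the countries in the question the two checks agree, so the loop above is fine there; the exact-name variant only matters when names overlap.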

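Contributor: 2016 experience points, 9+ upvotes
This is a generalization of your code: first collect every unique letter that appears in the letter column, then loop over those letters and do for each one essentially what you did above.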
complete_letter_set = set(''.join(df['letter']))
for l in complete_letter_set:
    df[f"letter{l}exists"] = df['letter'].map(lambda x: int(l in x))
Note that I simplified the condition 1 if A in x else 0 to just int(l in x), since int(True) == 1 and int(False) == 0 anyway.
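
Here is a runnable sketch of the loop above. The question's DataFrame is not shown on this page, so the sample data is hypothetical, and the sorted() call is my own addition to make the column order deterministic (iteration order over a plain set is not guaranteed).

```python
import pandas as pd

# Hypothetical sample data; the original question's DataFrame is not shown.
df = pd.DataFrame({"letter": ["AB", "C", "AC"]})

# Every distinct character in the column, sorted for a stable column order.
complete_letter_set = sorted(set("".join(df["letter"])))

for l in complete_letter_set:
    # int(l in x) is 1 when the letter occurs in the string, otherwise 0.
    df[f"letter{l}exists"] = df["letter"].map(lambda x: int(l in x))

print(df)
```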