2 Answers

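Contributor with 1860 experience points, 9+ upvotes
The simplest approach is assign combined with dictionary unpacking (**) to add the new columns, but it requires string column names (single words, as in this example):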
df1 = df1.assign(**df2.iloc[0])
print (df1)
A B C D LAT LON TIME
0 A0 B0 C0 D0 LAT0 LON0 T0
1 A1 B1 C1 D1 LAT0 LON0 T0
2 A2 B2 C2 D2 LAT0 LON0 T0
3 A3 B3 C3 D3 LAT0 LON0 T0
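The string-name requirement comes from Python itself: `**` unpacks into keyword arguments, and keyword names must be strings. The following is a minimal sketch of a fallback for other labels, assuming a hypothetical frame `df2_int` with integer column labels and a hypothetical helper `add_row_as_columns`; neither appears in the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3']})
# Hypothetical single-row frame whose column labels are integers, not strings.
df2_int = pd.DataFrame({0: ['LAT0'], 1: ['LON0']})


def add_row_as_columns(df, row):
    """Return a copy of df with every entry of row added as a constant column."""
    out = df.copy()
    for label, value in row.items():
        out[label] = value  # a scalar is broadcast to every row
    return out


try:
    df1.assign(**df2_int.iloc[0])
    unpacking_failed = False
except TypeError:
    # Python only accepts string keys when unpacking into keyword arguments.
    unpacking_failed = True

result = add_row_as_columns(df1, df2_int.iloc[0])
print(unpacking_failed)
print(result)
```

Like assign, the helper works on a copy, so df1 itself is left unchanged.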
To prepend the columns instead, another solution is reindex combined with join:
df1 = df2.iloc[[0]].reindex(df1.index, method='ffill').join(df1)
print (df1)
LAT LON TIME A B C D
0 LAT0 LON0 T0 A0 B0 C0 D0
1 LAT0 LON0 T0 A1 B1 C1 D1
2 LAT0 LON0 T0 A2 B2 C2 D2
3 LAT0 LON0 T0 A3 B3 C3 D3
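The reindex step works because forward fill carries the single df2 row to every later label. Here is a small sketch of that intermediate result, plus a case where it leaves gaps. The index containing -1 is a made-up example, not part of the question.

```python
import pandas as pd

df2 = pd.DataFrame({'LAT': ['LAT0'], 'LON': ['LON0'], 'TIME': ['T0']}, index=[0])

# Labels 0..3 all sit at or after df2's only label (0), so every row is filled.
filled = df2.iloc[[0]].reindex(pd.RangeIndex(4), method='ffill')

# Hypothetical index containing a label (-1) that precedes df2's label:
# forward fill has nothing to carry into that row, so it stays missing.
partial = df2.iloc[[0]].reindex(pd.Index([-1, 0, 1]), method='ffill')

print(filled)
print(partial)
```

So this variant relies on df2's label not coming after any label of df1.index, which holds for the default 0-based index used in the question.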
Very similar, using the DataFrame constructor:
df3 = pd.DataFrame(df2.iloc[0].to_dict(), index=df1.index)
print (df3)
LAT LON TIME
0 LAT0 LON0 T0
1 LAT0 LON0 T0
2 LAT0 LON0 T0
3 LAT0 LON0 T0
df1 = df3.join(df1)
print (df1)
LAT LON TIME A B C D
0 LAT0 LON0 T0 A0 B0 C0 D0
1 LAT0 LON0 T0 A1 B1 C1 D1
2 LAT0 LON0 T0 A2 B2 C2 D2
3 LAT0 LON0 T0 A3 B3 C3 D3
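The constructor variant depends on the index argument: a dict of scalars has no length of its own, so pandas repeats each value once per index label. This is a rough sketch, assuming a literal dict `row` that stands in for `df2.iloc[0].to_dict()`.

```python
import pandas as pd

# Stand-in for df2.iloc[0].to_dict() from the answer.
row = {'LAT': 'LAT0', 'LON': 'LON0', 'TIME': 'T0'}

try:
    pd.DataFrame(row)
    needs_index = False
except ValueError:
    # With only scalar values, pandas cannot infer the number of rows.
    needs_index = True

# Passing an index tells pandas how many times to repeat each scalar.
df3 = pd.DataFrame(row, index=pd.RangeIndex(4))

print(needs_index)
print(df3)
```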
Another solution uses numpy.broadcast_to. Be careful when the columns do not all share one type (here they are all strings): in that case some casting should be applied afterwards:
df3 = pd.DataFrame(np.broadcast_to(df2.values, (len(df1), len(df2.columns))),
                   columns=df2.columns, index=df1.index)
print (df3)
LAT LON TIME
0 LAT0 LON0 T0
1 LAT0 LON0 T0
2 LAT0 LON0 T0
3 LAT0 LON0 T0
df1 = df3.join(df1)
print (df1)
LAT LON TIME A B C D
0 LAT0 LON0 T0 A0 B0 C0 D0
1 LAT0 LON0 T0 A1 B1 C1 D1
2 LAT0 LON0 T0 A2 B2 C2 D2
3 LAT0 LON0 T0 A3 B3 C3 D3
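To illustrate the casting caveat, here is a sketch with a hypothetical df2 that mixes float, integer and string columns (the question's df2 holds only strings). Casting back with the dtypes recorded in df2 is one possible approach, not necessarily what the answer had in mind.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3']})
# Hypothetical df2 with mixed column types (float, int, string).
df2 = pd.DataFrame({'LAT': [48.5], 'LON': [2], 'TIME': ['T0']})

# df2.values has to hold all three types at once, so it is an object array.
block = np.broadcast_to(df2.values, (len(df1), len(df2.columns)))
df3 = pd.DataFrame(block, columns=df2.columns, index=df1.index)

# Cast each column back to the dtype it had in df2.
restored = df3.astype(df2.dtypes.to_dict())

print(df3.dtypes)
print(restored.dtypes)
```

Without the cast, numeric columns built from the object array would typically not keep their numeric dtype, which affects later arithmetic and memory use.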
Performance:
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
# Repeat the 4 rows 100000 times: [400000 rows x 4 columns]
df1 = pd.concat([df1] * 100000, ignore_index=True)
df2 = pd.DataFrame({'LAT': ['LAT0'],
                    'LON': ['LON0'],
                    'TIME': ['T0']},
                   index=[0])
In [286]: %timeit df1.assign(**df2.iloc[0])
23 ms ± 642 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [287]: %timeit df2.iloc[[0]].reindex(df1.index, method='ffill').join(df1)
35.7 ms ± 3.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [288]: %timeit pd.DataFrame(df2.iloc[0].to_dict(), index=df1.index).join(df1)
54.7 ms ± 163 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [289]: %timeit pd.DataFrame(np.broadcast_to(df2.values, (len(df1),len(df2.columns))), columns=df2.columns, index=df1.index).join(df1)
27.8 ms ± 2.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# bunji's solution (the other answer)
In [290]: %timeit df1.join(df2, how='outer').fillna(method='ffill')
244 ms ± 19.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
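Timings are only comparable if every variant yields the same data. The following is a small sanity check on reduced frames, written as a sketch: the helper `as_rows` and the dict `candidates` are illustrative names, and the last entry uses `.ffill()`, which is equivalent to the `fillna(method='ffill')` call timed above.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3']})
df2 = pd.DataFrame({'LAT': ['LAT0'], 'LON': ['LON0'], 'TIME': ['T0']})

candidates = {
    'assign': df1.assign(**df2.iloc[0]),
    'reindex': df2.iloc[[0]].reindex(df1.index, method='ffill').join(df1),
    'constructor': pd.DataFrame(df2.iloc[0].to_dict(), index=df1.index).join(df1),
    'broadcast_to': pd.DataFrame(
        np.broadcast_to(df2.values, (len(df1), len(df2.columns))),
        columns=df2.columns, index=df1.index).join(df1),
    'outer_join': df1.join(df2, how='outer').ffill(),
}

# The variants differ in column order, so compare in one fixed order.
columns = sorted(df1.columns.union(df2.columns))


def as_rows(df):
    """Rows as plain Python lists, with columns in one fixed order."""
    return df[columns].to_numpy(dtype=object).tolist()


reference = as_rows(candidates['assign'])
all_equal = all(as_rows(df) == reference for df in candidates.values())
print(all_equal)
```

The check ignores column order and dtypes on purpose, since the approaches differ only in where the new columns land.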

Contributor with 1804 experience points, 8+ upvotes
Another option is:
df = df1.join(df2, how='outer').ffill()  # same result as fillna(method='ffill'), which newer pandas deprecates
print(df)
A B C D LAT LON TIME
0 A0 B0 C0 D0 LAT0 LON0 T0
1 A1 B1 C1 D1 LAT0 LON0 T0
2 A2 B2 C2 D2 LAT0 LON0 T0
3 A3 B3 C3 D3 LAT0 LON0 T0
Note that how='outer' is only really necessary if df1 has fewer rows than df2, because join performs a left join by default.
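The difference between the default left join and how='outer' only shows up when df1 is the shorter frame. This is a minimal sketch with made-up one-row and two-row frames, not the data from the question.

```python
import pandas as pd

# Hypothetical frames where df1 is the shorter one.
df1 = pd.DataFrame({'A': ['A0']})
df2 = pd.DataFrame({'LAT': ['LAT0', 'LAT1']})

left = df1.join(df2)                # default how='left': only df1's index survives
outer = df1.join(df2, how='outer')  # union of both indexes, gaps become NaN
filled = outer.ffill()              # carry the last seen value down into the gaps

print(left)
print(filled)
```

In the question's setup df1 is the longer frame, so a plain `df1.join(df2).ffill()` should give the same rows there.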