1 Answer

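First, read in the data and convert the columns to datetime format. I read the data in from the clipboard and parsed the dates at that point. Second, sort each dataframe by its key (here the keys are "Connect" for df1 and "Start" for df2). After that, pandas merge_asof is enough. Note that the nearest-match merge can only happen on one key, not several: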
Sort the dataframes:
df1 = df1.sort_values(['Connect','Ended'])
df2 = df2.sort_values(['Start','End'])
Merge the dataframes:
merger = pd.merge_asof(df1, df2,
                       left_on='Connect',
                       right_on='Start',
                       tolerance=pd.Timedelta('20s'),
                       direction='forward')
merger
              Connect               Ended               Start                 End
0 2020-03-31 11:00:08 2020-03-31 11:00:10 2020-03-31 11:00:10 2020-03-31 11:00:14
1 2020-04-01 22:00:05 2020-04-01 12:00:05                 NaT                 NaT
2 2020-04-06 13:15:21 2020-04-06 14:05:18 2020-04-06 13:15:21 2020-04-06 14:05:18
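For anyone who wants to reproduce the steps above without the clipboard, here is a minimal sketch. The sample values are an assumption: they are reconstructed from the output shown above, and the original df2 may have contained more rows. The dates are parsed with pd.to_datetime instead of at read time.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the merged output above;
# the real df1/df2 in the question may differ.
df1 = pd.DataFrame({
    "Connect": ["2020-03-31 11:00:08", "2020-04-01 22:00:05", "2020-04-06 13:15:21"],
    "Ended": ["2020-03-31 11:00:10", "2020-04-01 12:00:05", "2020-04-06 14:05:18"],
})
df2 = pd.DataFrame({
    "Start": ["2020-03-31 11:00:10", "2020-04-06 13:15:21"],
    "End": ["2020-03-31 11:00:14", "2020-04-06 14:05:18"],
})

# Convert every column from string to datetime64.
df1 = df1.apply(pd.to_datetime)
df2 = df2.apply(pd.to_datetime)

# merge_asof requires both frames to be sorted on their key column.
df1 = df1.sort_values("Connect")
df2 = df2.sort_values("Start")

# For each Connect, take the first Start at or after it, within 20 seconds.
merger = pd.merge_asof(
    df1, df2,
    left_on="Connect", right_on="Start",
    tolerance=pd.Timedelta("20s"),
    direction="forward",
)
print(merger)
```

With direction='forward', a row in df1 only matches a Start that is equal to or later than its Connect. The tolerance caps how far ahead the match may be, which is why the second row gets NaT.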
From there it should be easy to select the matched and unmatched rows:
matched = merger.dropna()
matched
              Connect               Ended               Start                 End
0 2020-03-31 11:00:08 2020-03-31 11:00:10 2020-03-31 11:00:10 2020-03-31 11:00:14
2 2020-04-06 13:15:21 2020-04-06 14:05:18 2020-04-06 13:15:21 2020-04-06 14:05:18
unmatched = merger.loc[merger.isna().any(axis=1)]
unmatched
              Connect               Ended               Start                 End
1 2020-04-01 22:00:05 2020-04-01 12:00:05                 NaT                 NaT
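One caveat with the selection above: dropna() and isna().any(axis=1) look at every column, so a row with a missing value in df1's own columns would also be counted as unmatched. A possible alternative, sketched here on a hand-built frame that only mimics the shape of merger (an assumption, not the original data), is to test the right-hand key column alone:

```python
import pandas as pd

# Hypothetical stand-in for `merger`, built by hand with the same columns.
merger = pd.DataFrame({
    "Connect": pd.to_datetime(
        ["2020-03-31 11:00:08", "2020-04-01 22:00:05", "2020-04-06 13:15:21"]),
    "Ended": pd.to_datetime(
        ["2020-03-31 11:00:10", "2020-04-01 12:00:05", "2020-04-06 14:05:18"]),
    "Start": pd.to_datetime(
        ["2020-03-31 11:00:10", None, "2020-04-06 13:15:21"]),
    "End": pd.to_datetime(
        ["2020-03-31 11:00:14", None, "2020-04-06 14:05:18"]),
})

# A row found a partner exactly when the right-hand key was filled in.
has_match = merger["Start"].notna()
matched = merger[has_match]
unmatched = merger[~has_match]

print(matched)
print(unmatched)
```

On the data in this answer both approaches give the same split. They only differ when the left frame has missing values of its own.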
Hopefully that is enough. If you get stuck, the documentation has more examples to guide you.