Context: WiFi DHCP leases are recorded in the system log. The DHCP lease time is 24 hours, and one record represents one successful WiFi "session". Some devices (phones, for example) start several sessions per day, so we only need to count unique MAC addresses per 24 hours. We also want to know the timestamp of the first connection. Finally, we need to be able to sum by hour and by day.

TL;DR: I need to show daily unique MAC addresses broken down by hour. Not unique per hour, but unique per day... then broken down by hour and summed.

Sample dataframe:

       branch            timestamp                mac
    0  branch_a  2020-09-01 00:00:00  48:c7:96:1d:91:af
    1  branch_a  2020-09-01 00:08:00  48:c7:96:1d:91:bx
    2  branch_b  2020-09-01 00:36:07  48:c7:96:1d:80:ff
    3  branch_b  2020-09-01 00:41:24  48:c7:96:1d:86:ff
    4  branch_c  2020-09-01 00:44:33  48:c7:96:1d:76:bv

Steps:

1. Group by branch.
2. Keep the first occurrence of each MAC address per day (unique MACs per day).
3. Sum the MAC addresses by hour.

My attempt below still shows the same MACs:

    branch_daily = wifi.groupby(['branch','month', 'timestamp'])['mac'].first()

Expected result:

       branch            timestamp  mac
    0  branch_a  2020-09-01 00:00:00    5
    1  branch_a  2020-09-01 00:01:00   10
    2  branch_a  2020-09-01 00:02:00    3
    3  branch_a  2020-09-01 00:03:00    4
    4  branch_a  2020-09-01 00:04:00   11

where mac is the sum per hour.

Answer:

    import pandas as pd

    # The format must match the raw log strings; this one parses e.g. 'Sep 01 2020 00:00:00'.
    wifi['timestamp'] = pd.to_datetime(wifi['timestamp'], format='%b %d %Y %H:%M:%S')
    wifi['month'] = wifi['timestamp'].dt.month
    wifi['day'] = wifi['timestamp'].dt.day
    wifi['hour'] = wifi['timestamp'].dt.hour

    # Keep the first record of each MAC per calendar day
    # ('month' is included so the same day number in different months is not merged).
    uniq_per_day = wifi.drop_duplicates(subset=['month', 'day', 'mac'], keep='first')

    # Hourly
    uniq_per_day.groupby(['branch','month','day','hour']).agg({'mac':'count'})

    # Daily
    uniq_per_day.groupby(['branch','month','day']).agg({'mac':'count'})
    #...etc.
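To make the approach above easier to try out, here is a minimal, self-contained sketch. The sample rows and shortened MAC strings are made up for illustration, and the `date` and `unique_macs` column names are not from the original post. It also assumes a device should be counted once per branch per day, which the question does not state explicitly; drop `branch` from the `subset` list if a device should be counted once per day across all branches.

```python
import pandas as pd

# Hypothetical sample rows; MACs and times are invented for this sketch.
wifi = pd.DataFrame({
    "branch": ["branch_a", "branch_a", "branch_a", "branch_b", "branch_b"],
    "timestamp": [
        "2020-09-01 00:05:00",
        "2020-09-01 09:30:00",  # same device as the first row, later that day
        "2020-09-01 09:45:00",
        "2020-09-01 00:36:07",
        "2020-09-02 00:10:00",  # same device as the previous row, next day
    ],
    "mac": ["aa:aa", "aa:aa", "bb:bb", "cc:cc", "cc:cc"],
})
wifi["timestamp"] = pd.to_datetime(wifi["timestamp"])

# A real calendar date avoids mixing up the same day number across months or years.
wifi["date"] = wifi["timestamp"].dt.normalize()
wifi["hour"] = wifi["timestamp"].dt.hour

# Sort chronologically so keep="first" is the earliest connection of the day.
# Assumption: uniqueness is per branch, per day, per MAC.
first_seen = (
    wifi.sort_values("timestamp")
        .drop_duplicates(subset=["branch", "date", "mac"], keep="first")
)

# Hourly breakdown of the daily-unique devices.
hourly = (
    first_seen.groupby(["branch", "date", "hour"])["mac"]
              .count()
              .reset_index(name="unique_macs")
)

# Daily totals are the sum of the hourly counts.
daily = hourly.groupby(["branch", "date"])["unique_macs"].sum().reset_index()

print(hourly)
print(daily)
```

Because each device is kept only at its first connection of the day, the hourly counts add up to the daily unique count, and the `timestamp` column of `first_seen` holds the first-connection time the question asks for.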
Pandas: Daily unique values per hour in WiFi logs
明月笑刀无情
2023-10-18 20:59:03