1 Answer

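Create an evenly spaced datetime index, apply it to your data, and then compute a rolling sum over the dataframe using that evenly spaced index. Because the work happens inside numpy/pandas, it is much faster than looping over the data in Python.
Using the data from your example and assuming millisecond intervals: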
import pandas as pd
from io import StringIO

# Sample data from the question: tab-separated timestamp and value, no header row
df = """2020-04-01 00:03:48.197028\t1
2020-04-01 00:24:07.186631\t11
2020-04-01 00:24:07.200361\t5
2020-04-01 00:24:07.204382\t1
2020-04-01 00:24:07.208525\t13"""
# Reading the sample dataframe; header=None keeps the first row as data
mfile = StringIO(df)
adf = pd.read_csv(mfile, sep="\t", header=None, names=['mtimestamp', 'mnumber'])
adf.mtimestamp = pd.to_datetime(adf.mtimestamp)
# Creating a proper datetime index
adf = adf.set_index('mtimestamp')
# Resampling onto an evenly spaced 1 ms grid and summing; empty bins become 0
adf.resample('1ms').sum()
This yields (first rows shown):
                         mnumber
mtimestamp
2020-04-01 00:03:48.197        1
2020-04-01 00:03:48.198        0
2020-04-01 00:03:48.199        0
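The answer describes a rolling sum, but the code shown stops at the resampling step. Here is a minimal sketch of the remaining step, assuming a window of 20 consecutive 1 ms bins. The names `events`, `per_ms`, `window_bins` and `rolling_total` are illustrative, and the window size is an assumption rather than something taken from the question.

```python
from io import StringIO

import pandas as pd

# Same sample data as above (tab-separated, no header row).
raw = """2020-04-01 00:03:48.197028\t1
2020-04-01 00:24:07.186631\t11
2020-04-01 00:24:07.200361\t5
2020-04-01 00:24:07.204382\t1
2020-04-01 00:24:07.208525\t13"""

events = pd.read_csv(
    StringIO(raw),
    sep="\t",
    header=None,
    names=["mtimestamp", "mnumber"],
    parse_dates=["mtimestamp"],
    index_col="mtimestamp",
)

# Evenly spaced 1 ms grid; bins without events sum to 0.
per_ms = events["mnumber"].resample("1ms").sum()

# Assumed window: 20 consecutive 1 ms bins (hypothetical, adjust to your case).
window_bins = 20
rolling_total = per_ms.rolling(window=window_bins, min_periods=1).sum()

print(rolling_total.tail())
```

Because the grid is evenly spaced, a fixed number of rows corresponds to a fixed time span, so the integer window behaves like a time window. Note that a 1 ms grid over roughly 20 minutes produces more than a million rows, so a coarser frequency may be preferable for long time ranges.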