I have a DataFrame that looks like this:

    date      mood      score      count  avg      abs
    23/12/18  negative  -50.893    137    -0.371   50.893
    23/12/18  neutral   0.2193     10     0.0219   0.2193
    23/12/18  positive  336.5098   673    0.5000   336.5098
    24/12/18  positive  91.2414    232    -0.393   91.2414
    24/12/18  neutral   0.063      14     0.0045   0.063
    24/12/18  negative  -649.697   1184   0.5487   649.697
    25/12/18  negative  -72.4142   8      -9.0517  72.4142
    25/12/18  positive  0          0      0        0
    25/12/18  neutral   323.0056   173    1.86708  323.0056
    26/12/18  negative  -12.0467   15     -.8031   12.0467

I want to apply the following conditions to this dataset.

Condition 1: for each date, keep only the row whose absolute score (abs) is the greatest of the three moods, together with its other attributes.
Condition 2: no duplicate dates are to be kept.

So the dataset will be much smaller than the original.

Expected output:

    date      mood_corrected  score      count  avg         abs
    23/12/18  positive        336.5098   673    0.50001456  336.5098
    24/12/18  negative        649.697    1184   0.54873057  649.697
    25/12/18  neutral         323.0056   173    1.86708439  323.0056
    26/12/18  negative        -12.0467   15     -0.8031     12.0467

My code:

    import pandas as pd
    df = pd.read_csv('file.csv')
    new_df = df.sort_values('abs', ascending=False).drop_duplicates(['date', 'mood'])

The result is sorted by absolute value (abs), but I still have the full dataset; it is not reduced. Any help is appreciated. Thank you very much.

Note: I searched Stack Overflow but did not find a similar question.
1 Answer

楊__羊羊
Contributed 1943 experience points, received over 7 likes
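The following will do the job. Your call deduplicates on ['date', 'mood'], and every (date, mood) pair in the data is already unique, so no rows are dropped. Deduplicating on 'date' alone keeps only the first row per date, which after the descending sort is the row with the largest abs:

    new_df = df.sort_values('abs', ascending=False).drop_duplicates(['date']).sort_values('date')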
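
To check this against the sample from the question, here is a minimal self-contained sketch. It assumes the data is comma-separated (the layout of file.csv is not shown in the question, so the inline `raw` string is a stand-in for it), and it also shows a `groupby` plus `idxmax` variant as an alternative to sorting. The names `raw`, `by_sort`, and `by_idxmax` are illustrative only.

```python
import io

import pandas as pd

# Stand-in for file.csv: the sample rows from the question, assumed comma-separated.
raw = """date,mood,score,count,avg,abs
23/12/18,negative,-50.893,137,-0.371,50.893
23/12/18,neutral,0.2193,10,0.0219,0.2193
23/12/18,positive,336.5098,673,0.5000,336.5098
24/12/18,positive,91.2414,232,-0.393,91.2414
24/12/18,neutral,0.063,14,0.0045,0.063
24/12/18,negative,-649.697,1184,0.5487,649.697
25/12/18,negative,-72.4142,8,-9.0517,72.4142
25/12/18,positive,0,0,0,0
25/12/18,neutral,323.0056,173,1.86708,323.0056
26/12/18,negative,-12.0467,15,-0.8031,12.0467
"""
df = pd.read_csv(io.StringIO(raw))

# Approach from the answer: sort by abs descending, keep the first row per date.
by_sort = (
    df.sort_values('abs', ascending=False)
      .drop_duplicates(['date'])
      .sort_values('date')
      .reset_index(drop=True)
)

# Alternative: look up the index label of the largest abs within each date.
by_idxmax = df.loc[df.groupby('date')['abs'].idxmax()].reset_index(drop=True)

print(by_sort[['date', 'mood', 'abs']])
```

Both variants keep one row per date. Note that `sort_values('date')` compares the dd/mm/yy values as strings, which happens to give the right order for this sample; for data spanning several months or years it is probably safer to parse the column first, for example with `pd.to_datetime(df['date'], format='%d/%m/%y')`.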