I have a very basic script below that demonstrates the problem:

    from imblearn.over_sampling import ADASYN
    import pandas as pd, numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    data = pd.read_csv('glass.csv')
    classes = data.values[:, -1]
    data = data.iloc[:, :-1]

    adasyn = ADASYN(sampling_strategy='not majority', random_state=8, n_neighbors=3)
    new_data, new_classes = adasyn.fit_resample(data, classes)

    X_train, X_test, y_train, y_test = train_test_split(new_data, new_classes, test_size = 0.20)

    rfc = RandomForestClassifier()
    rfc.fit(X_train, y_train)
    print("Score: {}".format(rfc.score(X_test, y_test)))

The aim is to balance the following class distribution:

    (214, 10)
    Class=1, Count=70, Percentage=32.710%
    Class=2, Count=76, Percentage=35.514%
    Class=3, Count=17, Percentage=7.944%
    Class=5, Count=13, Percentage=6.075%
    Class=6, Count=9, Percentage=4.206%
    Class=7, Count=29, Percentage=13.551%

so that every class has an equal (or nearly equal) number of samples. However, running the code above produces:

    ValueError: No samples will be generated with the provided ratio settings.

Changing sampling_strategy to minority does successfully oversample the minority class 6, bringing it to 74 samples, but it still leaves the remaining classes imbalanced. So I am looking for a way to fully oversample all minority classes using ADASYN.

The ADASYN documentation states:

    'not majority': resample all classes but the majority class;

but this is clearly not happening.
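To make the goal concrete, here is a minimal sketch (plain Python, no imblearn involved) that works out how many synthetic rows each non-majority class would need to match the largest class. The class counts are copied from the summary above; the helper name samples_needed is hypothetical and only for illustration.

```python
from collections import Counter

# Class counts copied from the summary in the question (214 rows in total).
counts = Counter({1: 70, 2: 76, 3: 17, 5: 13, 6: 9, 7: 29})


def samples_needed(class_counts):
    """Return how many synthetic rows each non-majority class needs
    to reach the size of the largest class (hypothetical helper)."""
    majority_size = max(class_counts.values())
    return {
        label: majority_size - n
        for label, n in class_counts.items()
        if n < majority_size
    }


needed = samples_needed(counts)
for label in sorted(needed):
    print(f"Class={label}, Count={counts[label]}, Needed={needed[label]}")
```

Class 1 would need only 6 extra rows on top of its 70 samples, while class 6 would need 67. Whether such a small request for class 1 is related to the ValueError is an assumption that nothing in this question confirms. imblearn over-samplers also accept a dict for sampling_strategy that maps each class label to the desired total number of samples, so numbers like these could be used to try per-class targets.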
Oversampling multiclass data with the ADASYN algorithm fails
慕的地8271018
2023-07-27 10:33:19