I want to get the mean, minimum, maximum, and standard deviation for each cluster computed with the k-means method. Is the code below correct?

    import pandas as pd
    from sklearn.cluster import KMeans

    dataset = pd.read_csv("C:/Users/../cardio_train_py.csv", sep=';')
    clusterDB_1 = dataset[['Age','BMI','cardio']].copy()
    kmeans = KMeans(n_clusters=8).fit(clusterDB_1)
    X=[0,1,2,3,4,5,6,7]

    print('Age mean() for each cluster')
    for x in X:
        check = clusterDB_1[kmeans.labels_ == x]
        print(check['Age'].mean())

    print('BMI mean() for each cluster')
    for x in X:
        check = clusterDB_1[kmeans.labels_ == x]
        print(check['BMI'].mean())

    print('cardio == 0 count() for each cluster')
    for x in X:
        check = clusterDB_1[kmeans.labels_ == x]
        print(len(check[check['cardio'] == 1]))

I am asking because the values I get (for example the mean Age, the mean BMI, and the count of cardio == 0) differ from the values obtained in Statistica (the picture shows the Statistica results).

Here are the BMI results (computed in Python), pasted as one unbroken string:

24.46858773626099624.04785593330728230.54886546867411631.9841046300499332.89129084635681166.5735714285714641.9784573748308524.16813400017246

This is my database => https://www.easypaste.org/file/JcyGhA8Y/cardio.train.py.csv?lang=pl

Thanks for all the help and tips :)
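For comparison, here is a minimal sketch of one way to collect the mean, min, max and standard deviation per cluster in a single table with pandas `groupby`. The helper name `summarize_by_cluster`, the output column names, and the tiny demo data are all made up for illustration; only the `Age`, `BMI` and `cardio` columns come from the question. Note that the last loop in the question prints a `cardio == 0` heading while counting rows where `cardio == 1`, so the sketch counts both classes explicitly. pandas `std` uses the sample formula (ddof=1) by default.

```python
import pandas as pd


def summarize_by_cluster(df, labels):
    """Return per-cluster statistics for Age, BMI and cardio.

    df     : DataFrame with 'Age', 'BMI' and 'cardio' columns
    labels : one cluster label per row, e.g. kmeans.labels_
    (hypothetical helper, not part of pandas or scikit-learn)
    """
    # Attach the labels as a column so groupby can split on them.
    labelled = df.assign(cluster=list(labels))
    grouped = labelled.groupby("cluster")

    # mean / min / max / std for the numeric columns, one row per cluster.
    stats = grouped[["Age", "BMI"]].agg(["mean", "min", "max", "std"])
    # Flatten the two-level column index into names like 'BMI_mean'.
    stats.columns = [f"{col}_{stat}" for col, stat in stats.columns]

    # Count both cardio classes explicitly so the heading cannot drift
    # away from the condition that is actually counted.
    stats["cardio_0_count"] = grouped["cardio"].agg(lambda s: int((s == 0).sum()))
    stats["cardio_1_count"] = grouped["cardio"].agg(lambda s: int((s == 1).sum()))
    stats["size"] = grouped.size()
    return stats


# Tiny made-up data set and labels, for illustration only.
demo = pd.DataFrame({
    "Age": [50, 60, 40, 45],
    "BMI": [22.0, 24.0, 30.0, 34.0],
    "cardio": [0, 1, 1, 1],
})
demo_stats = summarize_by_cluster(demo, [0, 0, 1, 1])
print(demo_stats)
```

With the data from the question this would be called as `summarize_by_cluster(clusterDB_1, kmeans.labels_)`. Since `KMeans` starts from randomly chosen initial centroids, the cluster numbering and even the cluster membership can change from run to run; passing a fixed seed such as `KMeans(n_clusters=8, random_state=0)` (the value 0 is arbitrary) makes the Python side reproducible, though it does not by itself guarantee the same clusters as another program.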