1 Answer

You can return a DataFrame from the function:
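import numpy as np
import pandas as pd
from scipy import stats

def z_score(x):
    # absolute z-score of each value within its group
    z = np.abs(stats.zscore(x))
    # flag values greater than 5
    c = np.where(x > 5, 1, 0)
    return pd.DataFrame({'zscore': z, 'label': c}, index=x.index)

# group_keys=False keeps the original row index so the result aligns with df (matters on pandas >= 2.0)
df[['zscore', 'label']] = df.groupby('GROUP', group_keys=False)['VALUE'].apply(z_score)
print(df)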
   GROUP  VALUE    zscore  label
0      1      5  1.135550      0
1      2      2  1.000000      0
2      1     10  1.297771      1
3      2     20  1.000000      1
4      1      7  0.162221      1
For better performance, though, use groupby only for the z-score and compute the label column outside the groupby:
def z_score(x):
    z = np.abs(stats.zscore(x))
    return z

df['zscore'] = df.groupby('GROUP')['VALUE'].transform(z_score)
# lambda function alternative
# df['zscore'] = df.groupby('GROUP')['VALUE'].transform(lambda x: np.abs(stats.zscore(x)))
df['label'] = np.where(df['VALUE'] > 5, 1, 0)
print(df)
   GROUP  VALUE    zscore  label
0      1      5  1.135550      0
1      2      2  1.000000      0
2      1     10  1.297771      1
3      2     20  1.000000      1
4      1      7  0.162221      1
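The answer never shows how df is built, so here is a self-contained sketch of the second approach. The sample data is reconstructed from the printed output above and is an assumption, not the asker's original frame. As a further variation that is not part of the original answer, the z-score is computed here from per-group mean and standard deviation instead of calling scipy; ddof=0 is used because that is the default of scipy.stats.zscore.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({'GROUP': [1, 2, 1, 2, 1],
                   'VALUE': [5, 2, 10, 20, 7]})

g = df.groupby('GROUP')['VALUE']

# Broadcast each group's mean and population std back onto the rows.
mean = g.transform('mean')
# ddof=0 matches the default of scipy.stats.zscore
std = g.transform(lambda s: s.std(ddof=0))

# Absolute z-score within each group.
df['zscore'] = ((df['VALUE'] - mean) / std).abs()
# The label does not depend on the group, so it is computed on the whole column.
df['label'] = np.where(df['VALUE'] > 5, 1, 0)
print(df)
```

Only the z-score needs the groupby; the label is a plain vectorized comparison on the full column, which is why moving it outside the grouped function avoids per-group overhead.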