3 Answers

Contributed 1783 experience points, received more than 4 likes
IIUC, you are looking for cumcount:
df["count"] = df.groupby(['Building ID', 'Assessment Phase']).cumcount()+1
print(df)
   Building ID  Assessment Phase  count
0            1           Phase 1      1
1            2           Phase 2      1
2            2           Phase 2      2
3            3           Phase 3      1
4            3           Phase 3      2
5            3           Phase 3      3
6            4               Unk      1
7            4           Phase 1      1
8            5           Phase 2      1
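For anyone who wants to run this directly, here is a minimal self-contained sketch. The frame below is an assumption, rebuilt from the printed output above rather than taken from the original question:

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the printed output above
df = pd.DataFrame({
    "Building ID": [1, 2, 2, 3, 3, 3, 4, 4, 5],
    "Assessment Phase": ["Phase 1", "Phase 2", "Phase 2", "Phase 3", "Phase 3",
                         "Phase 3", "Unk", "Phase 1", "Phase 2"],
})

# cumcount() numbers rows within each (building, phase) group starting at 0,
# so adding 1 gives a 1-based running count
df["count"] = df.groupby(["Building ID", "Assessment Phase"]).cumcount() + 1
print(df)
```

Because "Unk" forms its own group per building, it needs no special handling here.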

Contributed 1803 experience points, received more than 6 likes
First, create the data frame:
from io import StringIO
import pandas as pd
data = '''Building ID  Assessment Phase
001  Phase 1
002  Phase 2
002  Phase 2
003  Phase 3
003  Phase 2
003  Phase 3
004  Unk
004  Phase 1
005  Phase 2
'''
# Columns are separated by two or more spaces, so "Phase 1" stays in one field
df = pd.read_csv(StringIO(data), sep=r'\s\s+', engine='python')
Second, create a helper column named "counter" (0 for an unknown assessment phase, 1 otherwise):
df['counter'] = 1
mask = df['Assessment Phase'] == 'Unk'
df.loc[mask, 'counter'] = 0
Third, group by Building ID and apply cumsum (cumulative sum) to the counter column. Then update the "Unk" rows manually.
df['Bldg_Phs_Ord'] = df.groupby('Building ID')['counter'].cumsum()
df.loc[mask, 'Bldg_Phs_Ord'] = 1
print(df)
   Building ID  Assessment Phase  counter  Bldg_Phs_Ord
0            1           Phase 1        1             1
1            2           Phase 2        1             1
2            2           Phase 2        1             2
3            3           Phase 3        1             1
4            3           Phase 2        1             2
5            3           Phase 3        1             3
6            4               Unk        0             1
7            4           Phase 1        1             1
8            5           Phase 2        1             1
I don't know how to avoid the special handling of the "Unk" assessment phase. Also note that cumsum() is sensitive to the initial row order of the data frame.
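To illustrate the order sensitivity, here is a small sketch. The three-row frame and the helper name phase_order are made up for this example, not part of the original question:

```python
import pandas as pd

# Hypothetical data: one building with a known, an unknown and a known phase
df = pd.DataFrame({
    "Building ID": [3, 3, 3],
    "Assessment Phase": ["Phase 3", "Unk", "Phase 2"],
})

def phase_order(frame):
    # 1 for a known phase, 0 for "Unk", accumulated per building in row order
    known = frame["Assessment Phase"].ne("Unk").astype(int)
    return known.groupby(frame["Building ID"]).cumsum()

original = phase_order(df)             # rows processed top to bottom
reordered = phase_order(df.iloc[::-1])  # same rows, processed bottom to top

# Compare the two results row by row (aligned on the original index)
print(pd.DataFrame({"original": original, "reordered": reordered}))
```

The same rows get different ordinals depending on the order in which they appear. If the data has a column that defines the real sequence (an assessment date, for instance), sorting by it before the cumsum would probably be safer.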

Contributed 1784 experience points, received more than 8 likes
Assuming df is your input data frame, try:
df['COUNT'] = df.groupby(['Building ID', 'Assessment Phase']).cumcount().add(1)
cumcount does not reduce the number of rows.
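A quick way to see the difference is to compare it with size(), which aggregates to one row per group. The small frame below is made up for illustration:

```python
import pandas as pd

# Hypothetical data: two rows in one group, one row in another
df = pd.DataFrame({
    "Building ID": [2, 2, 3],
    "Assessment Phase": ["Phase 2", "Phase 2", "Phase 3"],
})

grouped = df.groupby(["Building ID", "Assessment Phase"])

running = grouped.cumcount().add(1)  # one value per original row
totals = grouped.size()              # one value per group

print(len(df), len(running), len(totals))
```

Since the result of cumcount keeps the original index, it can be assigned straight back to the data frame as a new column.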