First, create a Series with Series.str.split and DataFrame.stack:
s = df['station'].str.split(expand=True).stack()
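The answer never shows `df` itself, so the sketch below uses a hypothetical input reconstructed from the printed result at the end. The exact strings, in particular the `5min`-style tokens, are assumptions. It only illustrates what the split-and-stack step produces: one token per row, indexed by the original row and the token position.

```python
import pandas as pd

# Hypothetical input inferred from the final printed table; the real df may differ.
df = pd.DataFrame({'station': [
    'A-line B-station 5min C-station',
    'D-line E-station 3min F-line G-station',
    'G-line H-station 2min I-station 4min J-station',
]})

# Split on whitespace into columns, then stack into one long Series.
# dropna() is a defensive addition that is not in the original answer:
# it guards against pandas versions whose stack() keeps the None padding
# that split(expand=True) adds to the shorter rows.
s = df['station'].str.split(expand=True).stack().dropna()

print(s.loc[0].tolist())
# ['A-line', 'B-station', '5min', 'C-station']
```

The first index level is the original row number, which is what the later `groupby(level=0)` relies on.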
Then remove the values ending in min by boolean indexing with Series.str.endswith:
df1 = s[~s.str.endswith('min')].to_frame('data').rename_axis(('a','b'))
Then create counters for the rows ending in line and for the rows ending in station, by filtering and using GroupBy.cumcount:
df1['Line'] = (df1[df1['data'].str.endswith('line')]
                   .groupby(level=0)
                   .cumcount()
                   .add(1)
                   .astype(str))
df1['Line'] = df1['Line'].ffill()
df1['station'] = (df1[df1['data'].str.endswith('station')]
                      .groupby(['a','Line'])
                      .cumcount()
                      .add(1)
                      .astype(str))
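To see why a filtered counter followed by a forward fill labels every token with its line number, here is a minimal sketch on a hypothetical toy frame (the `toy` name and its values are made up for illustration). The counter is computed only on the rows ending in `line`; the other rows receive NaN through index alignment and then inherit the number of the preceding line.

```python
import pandas as pd

# Hypothetical tokens for two input rows (index level 'a'),
# with the 'min' entries already removed.
idx = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (1, 3)],
    names=['a', 'b'])
toy = pd.DataFrame(
    {'data': ['A-line', 'B-station', 'C-station',
              'D-line', 'E-station', 'F-line', 'G-station']},
    index=idx)

is_line = toy['data'].str.endswith('line')

# cumcount numbers the 'line' rows 0, 1, ... within each input row;
# add(1) makes the numbering start at 1.
toy['Line'] = toy[is_line].groupby(level=0).cumcount().add(1).astype(str)

# Rows that are not lines are NaN at this point; ffill copies the
# number of the nearest preceding line onto them.
toy['Line'] = toy['Line'].ffill()

print(toy['Line'].tolist())
# ['1', '1', '1', '1', '1', '2', '2']
```

The forward fill runs over the whole column rather than per group, so this relies on every input row starting with a line token.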
Create the final labels by joining the two columns, replacing the missing values with df1['Line'] via Series.fillna:
df1['station'] = (df1['Line'] + '-' + df1['station']).fillna(df1['Line'])
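The fallback works because string concatenation with a missing value yields a missing value in pandas. A small sketch with hypothetical `line` and `station` Series (not the real columns) shows the effect: rows that are lines have no station counter, so they keep the bare line number.

```python
import numpy as np
import pandas as pd

# Hypothetical counters: NaN in 'station' marks rows that are lines.
line = pd.Series(['1', '1', '1', '2'])
station = pd.Series([np.nan, '1', '2', np.nan])

# Concatenation propagates NaN; fillna then falls back to the line number.
label = (line + '-' + station).fillna(line)

print(label.tolist())
# ['1', '1-1', '1-2', '2']
```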
Reshape with DataFrame.set_index and DataFrame.unstack:
df1 = df1.set_index('station', append=True)['data'].reset_index(level=1, drop=True).unstack()
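Rename the columns. This is not done earlier, to avoid wrong sorting: unstack sorts the column labels, and the bare labels sort each line directly before its stations.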
df1 = df1.rename(columns = lambda x: 'Station' + x if '-' in x else 'Line' + x)
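The ordering argument can be checked without pandas. The sketch below uses the six labels from this example; the helper name `add_prefix` is only for illustration and mirrors the lambda above. The bare labels are already in the desired order when sorted, whereas the prefixed names would group all lines before all stations.

```python
# Labels as they exist before renaming.
raw = ['1', '1-1', '1-2', '1-3', '2', '2-1']

# Sorting the bare labels keeps each line directly before its stations.
assert sorted(raw) == raw

def add_prefix(label):
    """Same rule as the rename lambda: stations contain a '-'."""
    return 'Station' + label if '-' in label else 'Line' + label

renamed = [add_prefix(label) for label in raw]

# Sorting after prefixing would separate lines from their stations.
print(sorted(renamed))
# ['Line1', 'Line2', 'Station1-1', 'Station1-2', 'Station1-3', 'Station2-1']
```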
Remove the column and index names:
df1.columns.name = None
df1.index.name = None
print(df1)
    Line1 Station1-1 Station1-2 Station1-3   Line2 Station2-1
0  A-line  B-station  C-station        NaN     NaN        NaN
1  D-line  E-station        NaN        NaN  F-line  G-station
2  G-line  H-station  I-station  J-station     NaN        NaN