4 Answers

Contributed 1890 experience points, received more than 9 likes
# Given the DataFrame x:
>>> import pandas as pd
>>> x = pd.DataFrame([['PartNo',12],['Meas1',45],['Meas2',23],['!END',''],['PartNo',13],['Meas1',63],['Meas2',73],['!END',''],['PartNo',12],['Meas1',82],['Meas2',84],['!END','']])
>>> x
         0   1
0   PartNo  12
1    Meas1  45
2    Meas2  23
3     !END
4   PartNo  13
5    Meas1  63
6    Meas2  73
7     !END
8   PartNo  12
9    Meas1  82
10   Meas2  84
11    !END
# Group by the first column and aggregate the values of each key into a list.
# Column 1 of the result then holds one list per key. Converting each of those
# lists to a Series produces a DataFrame, which only needs to be transposed.
>>> df = x.groupby(0).agg(lambda x: list(x))[1].apply(lambda x: pd.Series(x)).transpose()
>>> df[['PartNo','Meas1','Meas2']]
0  PartNo  Meas1  Meas2
0      12     45     23
1      13     63     73
2      12     82     84
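
The one-liner above can be easier to follow when split into its steps. The following is only a sketch of the same idea: the names `lists` and `expanded` are illustrative and not part of the original answer, and the `apply(pd.Series)` step is swapped for a plain `pd.DataFrame` constructor call. It assumes every record contains each key exactly once.

```python
import pandas as pd

x = pd.DataFrame([['PartNo', 12], ['Meas1', 45], ['Meas2', 23], ['!END', ''],
                  ['PartNo', 13], ['Meas1', 63], ['Meas2', 73], ['!END', ''],
                  ['PartNo', 12], ['Meas1', 82], ['Meas2', 84], ['!END', '']])

# Step 1: collect the values of each key into one list per key.
# The result is a Series indexed by '!END', 'Meas1', 'Meas2', 'PartNo'.
lists = x.groupby(0)[1].agg(list)

# Step 2: turn each list into one row of a DataFrame (one row per key).
expanded = pd.DataFrame(lists.tolist(), index=lists.index)

# Step 3: transpose so that keys become columns and records become rows,
# then keep only the columns of interest (this drops the '!END' column).
df = expanded.transpose()[['PartNo', 'Meas1', 'Meas2']]
print(df)
```

If a key were missing from some record, the lists would have different lengths and the rows would no longer line up, so this approach relies on the file being regular.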

Contributed 1851 experience points, received more than 3 likes
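Here is how I would do it. I would parse the file like any other text file and build records from the fields I need. I would treat each '!END' line as the signal that a row is complete, append that row to a list, and finally convert the list to a DataFrame.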
import pandas as pd

filename = 'PartDetail.csv'
with open(filename, 'r') as file:
    LinesFromFile = file.readlines()

RowToWrite = []
PartNo = Meas1 = Meas2 = None  # defined up front in case a field is missing
for EachLine in LinesFromFile:
    # Split at the first space: attribute name on the left, value on the right
    # (assumes the name and the value are separated by whitespace)
    CurrentAttrib, _, Value = EachLine.strip().partition(' ')
    Value = Value.strip()
    if CurrentAttrib == 'PartNo':
        PartNo = Value
    elif CurrentAttrib == 'Meas1':
        Meas1 = Value
    elif CurrentAttrib == 'Meas2':
        Meas2 = Value
    elif CurrentAttrib.startswith('!END'):
        # '!END' closes the current record, so store it as one row
        RowToWrite.append([PartNo, Meas1, Meas2])

PartsDataDF = pd.DataFrame(RowToWrite, columns=['PartNo', 'Meas1', 'Meas2'])  # Converting to DataFrame
This gives you a cleaner DataFrame, with one row per '!END'-terminated record and the columns PartNo, Meas1 and Meas2.
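
To try the record-building idea without a file on disk, the loop can be wrapped in a small function and fed an in-memory sample. This is a minimal sketch, not the answer's own code: `parse_records` is a hypothetical helper, and the sample text assumes each line has the form "name value" separated by a space.

```python
import io

def parse_records(lines, fields=('PartNo', 'Meas1', 'Meas2'), end_marker='!END'):
    """Hypothetical helper: return one row per end_marker-terminated block."""
    rows = []
    current = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(end_marker):
            # The record is complete: emit its fields in a fixed order
            rows.append([current.get(name) for name in fields])
            current = {}
        else:
            # Assumed line format: attribute name, a space, then the value
            key, _, value = line.partition(' ')
            if key in fields:
                current[key] = value.strip()
    return rows

sample = io.StringIO(
    "PartNo 12\nMeas1 45\nMeas2 23\n!END\n"
    "PartNo 13\nMeas1 63\nMeas2 73\n!END\n"
)
rows = parse_records(sample)
print(rows)
```

Unlike the loop above, this version resets the record after each '!END', so a field missing from one block shows up as None instead of silently reusing the previous block's value. The resulting list can be passed to `pd.DataFrame(rows, columns=['PartNo', 'Meas1', 'Meas2'])` as in the answer.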

Contributed 1827 experience points, received more than 4 likes
The file is not a csv file, so parsing it with the csv module cannot produce the correct output. It is not a well-known format either, so I would use a custom parser:
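import pandas as pd

# filename is the path to the input file
with open(filename) as fd:
    data = []
    row = None
    for line in fd:
        line = line.strip()
        if not line:
            continue  # skip blank lines, which cannot be split into key and value
        if line == '!END':
            row = None  # the next key starts a new record
        else:
            k, v = line.split(None, 1)
            if row is None:
                row = {k: v}
                data.append(row)
            else:
                row[k] = v

# Collect the column names in first-seen order (a set has no defined order)
header = list(dict.fromkeys(k for row in data for k in row))
df = pd.DataFrame(data, columns=header)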
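
Note that a parser of this kind stores every value as a string, because `split` returns strings. If numeric columns are wanted, which is an assumption about the use case, one possible follow-up step is `pd.to_numeric`. The sketch below uses a small hand-written `data` list in place of the parsed file.

```python
import pandas as pd

# Stand-in for the list of dicts built by the parser; all values are strings
data = [{'PartNo': '12', 'Meas1': '45', 'Meas2': '23'},
        {'PartNo': '13', 'Meas1': '63', 'Meas2': '73'}]
df = pd.DataFrame(data)

# Convert every column to a numeric dtype; values that cannot be parsed
# become NaN because of errors='coerce'
numeric = df.apply(pd.to_numeric, errors='coerce')
print(numeric.dtypes)
```

Whether to coerce bad values to NaN or let the conversion raise depends on how strictly the input should be validated.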

Contributed 1853 experience points, received more than 6 likes
Based on the information provided, I think you should be able to achieve what you want with this approach:
# df is the raw two-column DataFrame: keys in column 0, values in column 1
df = df[df[0] != '!END']
out = df.groupby(0).agg(list).T.apply(lambda x: x.explode(), axis=0)
Output:
0 Meas1 Meas2 PartNo
1    45    23     12
1    63    73     13
1    82    84     12
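This essentially groups the original df by the keys PartNo, Meas1 and Meas2 and creates one list per key. Each list is then exploded into a pd.Series, which yields one column per key with as many rows as there are entries per key (the counts should be the same for every key).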
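
The explode step, and the reason every row of the output carries the index label 1, can be seen on a single cell. This is a small illustrative sketch: the Series `cell` is made up to mimic one column of the transposed frame, and the final `reset_index` call is an optional extra that the answer itself does not use.

```python
import pandas as pd

# One cell holding a list, labelled 1 like the single row of the transposed frame
cell = pd.Series([[45, 63, 82]], index=[1], name='Meas1')

# explode() gives each list element its own row and repeats the index label
exploded = cell.explode()
print(exploded)

# Optionally replace the repeated label with a fresh 0..n-1 index
renumbered = exploded.reset_index(drop=True)
print(renumbered)
```

The same `reset_index(drop=True)` call could be applied to `out` if a running row index is preferred over the repeated label.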