4 Answers

Contributor with 1890 experience points, more than 9 likes
# Given the DataFrame x:
>>> import pandas as pd
>>> x = pd.DataFrame([['PartNo',12],['Meas1',45],['Meas2',23],['!END',''],['PartNo',13],['Meas1',63],['Meas2',73],['!END',''],['PartNo',12],['Meas1',82],['Meas2',84],['!END','']])
>>> x
         0   1
0   PartNo  12
1    Meas1  45
2    Meas2  23
3     !END
4   PartNo  13
5    Meas1  63
6    Meas2  73
7     !END
8   PartNo  12
9    Meas1  82
10   Meas2  84
11    !END
# Group by the first column and aggregate the values of each key into a list.
# Column 1 of the result is then a Series of lists, indexed by key.
# Converting each list to a Series yields a DataFrame with one row per key,
# so a final transpose gives one column per key.
>>> df = x.groupby(0).agg(lambda x: list(x))[1].apply(lambda x: pd.Series(x)).transpose()
>>> df[['PartNo','Meas1','Meas2']]
0  PartNo  Meas1  Meas2
0      12     45     23
1      13     63     73
2      12     82     84
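
To see what the one-liner does, it can help to split it into steps. The sketch below rebuilds the same sample data, shows the intermediate Series of lists, and then tries an alternative that is not part of the answer above: numbering the records explicitly and pivoting. The column names `key` and `value` and the variables `lists`, `record` and `wide` are illustrative assumptions.

```python
import pandas as pd

rows = [['PartNo', 12], ['Meas1', 45], ['Meas2', 23], ['!END', ''],
        ['PartNo', 13], ['Meas1', 63], ['Meas2', 73], ['!END', ''],
        ['PartNo', 12], ['Meas1', 82], ['Meas2', 84], ['!END', '']]
# Named columns (an assumption, for readability) instead of the default 0 and 1.
x = pd.DataFrame(rows, columns=['key', 'value'])

# Step 1: one list of values per key, which is what groupby + agg(list) builds.
lists = x.groupby('key')['value'].agg(list)

# Alternative: drop the '!END' markers, give every record a running number
# (a new record starts at each 'PartNo' line), then pivot keys into columns.
data = x[x['key'] != '!END'].copy()
data['record'] = (data['key'] == 'PartNo').cumsum() - 1
wide = data.pivot(index='record', columns='key', values='value')
wide = wide[['PartNo', 'Meas1', 'Meas2']]
```

The pivot variant relies on every record starting with a `PartNo` line. The list-based version relies on every key appearing the same number of times.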

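Contributor with 1851 experience points, more than 3 likes
Here is how I would do it. I would parse the file like any other text file and build records from the fields I need. I would use the '!END' line as the signal that a row is complete, append that row to a list, and finally convert the list to a DataFrame.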
import pandas as pd

filename = 'PartDetail.csv'
with open(filename, 'r') as file:
    LinesFromFile = file.readlines()

RowToWrite = []
for EachLine in LinesFromFile:
    # The attribute name is everything before the first space
    ValuePosition = EachLine.find(" ") + 1
    CurrentAttrib = EachLine[0:ValuePosition - 1]
    # The value is everything after it; strip() removes padding and the newline
    Value = EachLine[ValuePosition:].strip()
    if CurrentAttrib == 'PartNo':
        PartNo = Value
    if CurrentAttrib == 'Meas1':
        Meas1 = Value
    if CurrentAttrib == 'Meas2':
        Meas2 = Value
    if EachLine[0:4] == '!END':
        # '!END' closes the record, so store the row collected so far
        RowToWrite.append([PartNo, Meas1, Meas2])

PartsDataDF = pd.DataFrame(RowToWrite, columns=['PartNo', 'Meas1', 'Meas2'])  # Converting to DataFrame
This gives you a cleaner DataFrame, with one row per part and the columns PartNo, Meas1 and Meas2.
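
One thing to watch in the loop above: PartNo, Meas1 and Meas2 keep their values from the previous record, so a record that lacks a field would silently reuse the old one. The sketch below resets the state at every '!END'. The helper name `read_parts` and the in-memory sample are hypothetical, and the "key value" line layout is an assumption about the file.

```python
import io
import pandas as pd

# Assumed file content: one "key value" pair per line, records closed by "!END".
sample = io.StringIO(
    "PartNo 12\nMeas1 45\nMeas2 23\n!END\n"
    "PartNo 13\nMeas1 63\n!END\n"          # second record has no Meas2
)

FIELDS = ('PartNo', 'Meas1', 'Meas2')

def read_parts(handle):
    """Collect one row per '!END' marker; missing fields become None."""
    rows = []
    current = {}
    for line in handle:
        parts = line.split(None, 1)         # split on the first run of whitespace
        if not parts:
            continue                        # blank line
        if parts[0] == '!END':
            rows.append([current.get(name) for name in FIELDS])
            current = {}                    # forget values from the finished record
        elif len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    return pd.DataFrame(rows, columns=list(FIELDS))

parts_df = read_parts(sample)
```

With a real file, the same function could be called as `read_parts(open(filename))`, ideally inside a `with` block.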

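Contributor with 1827 experience points, more than 4 likes
The file is not a CSV file, so parsing it with the csv module cannot produce the correct output. It is not a well-known format either, so I would use a custom parser: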
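import pandas as pd

with open(filename) as fd:        # filename: path to the data file
    data = []
    row = None
    for line in fd:
        line = line.strip()
        if not line:
            continue              # skip blank lines
        if line == '!END':
            row = None            # the next key starts a new record
        else:
            k, v = line.split(None, 1)
            if row is None:
                row = {k: v}
                data.append(row)
            else:
                row[k] = v

# Column names in order of first appearance (a set has no defined order)
header = list(dict.fromkeys(k for row in data for k in row))
df = pd.DataFrame(data, columns=header)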
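
The parser above stores every value as a string, and records do not have to share the same keys. A minimal sketch of the follow-up step, assuming all fields are numeric (the source does not say so); the hand-written `data` list stands in for the parser output and `numeric` is an illustrative name.

```python
import pandas as pd

# Rows shaped like the parser output: plain dicts with string values.
data = [{'PartNo': '12', 'Meas1': '45', 'Meas2': '23'},
        {'PartNo': '13', 'Meas1': '63'}]     # this record lacks Meas2

df = pd.DataFrame(data)                      # the missing field becomes NaN

# Convert each column from strings to numbers; NaN entries stay NaN.
numeric = df.apply(pd.to_numeric)
```

Columns that contain a gap end up as floats, because NaN cannot be stored in an integer column.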

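Contributor with 1853 experience points, more than 6 likes
Based on the information provided, I think you should be able to get what you want with this approach: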
df = df[df[0] != '!END']
out = df.groupby(0).agg(list).T.apply(lambda x: x.explode(), axis=0)
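Output:
0  Meas1  Meas2  PartNo
1     45     23      12
1     63     73      13
1     82     84      12
This essentially groups the original df by the keys PartNo, Meas1 and Meas2 and builds one list of values per key. Each list is then exploded into a pd.Series, which gives one column per key, with as many rows as there are entries per key (the count should be the same for every key).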
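
Every row of `out` carries the same index label, 1, because the exploded values inherit the label of the single transposed row. A small sketch of tidying that up, using a shortened copy of the sample data; the name `tidy` and the column order are illustrative choices, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame([['PartNo', 12], ['Meas1', 45], ['Meas2', 23], ['!END', ''],
                   ['PartNo', 13], ['Meas1', 63], ['Meas2', 73], ['!END', '']])

df = df[df[0] != '!END']
out = df.groupby(0).agg(list).T.apply(lambda col: col.explode(), axis=0)

# Replace the repeated label with 0, 1, ... and put PartNo first.
tidy = out.reset_index(drop=True)[['PartNo', 'Meas1', 'Meas2']]
```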