I have an Excel file whose first row is always empty. The second row contains data I don't need. The third row is always the headers, and the rows after it are always data, with Total falling under Title_3 and Title_4. I am using pandas, and I have attached the output of my result. My goal is for everything in the array to be a string, including nan. How can I replace nan with a string and get the output shown below?

Target output

['nan', 'Title_1', 'RED_100', '2019-01-01 00:00:00', '10', 'nan']
['nan', 'Title_2', 'GREEN_200', '2018-02-02 00:00:00', '20', 'nan']
['nan', 'Title_3', 'RED_300', '2019-11-15 00:00:00', '30', 'Total']
['123456', 'Title_4', 'YELLOW_100', '2019-01-01 00:00:00', '40', '100']

Code

import pandas as pd
import io
import numpy as np

path = r'C:\Temp Files\Excel_2.xlsx'
df = pd.read_excel(path, dtype=str, index_col=None, na_values=['NA'])
#df.drop(df.head(2).index, inplace=True)
print(df)

res = (df.dropna(how='all')          #remove completely empty rows
         .dropna(how='all', axis=1)  #remove completely empty columns
         .T                          #flip columns into row position
         #convert to list
         .to_numpy()
         .tolist()
      )
print()
Title_1 = res[1]
print(Title_1)

Output

            Unnamed: 0           Unnamed: 1           Unnamed: 2           Unnamed: 3
0                  NaN                  NaN                  NaN               123456
1              Title_1              Title_2              Title_3              Title_4
2              RED_100            GREEN_200              RED_300           YELLOW_100
3  2019-01-01 00:00:00  2018-02-02 00:00:00  2019-11-15 00:00:00  2019-01-01 00:00:00
4                   10                   20                   30                   40
5                  NaN                  NaN                Total                  100

[nan, 'Title_2', 'GREEN_200', '2018-02-02 00:00:00', '20', nan]
1 Answer

一只萌萌小番薯
Contributed 1795 experience points, received more than 7 likes
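#add skiprows=1, nrows=6 to skip the empty first row and stop after the Total row
#header=None is an assumption: it keeps the 123456 row as data instead of using it as column names
df = pd.read_excel(path, dtype=str, index_col=None, header=None, na_values=['NA'], skiprows=1, nrows=6)
#transpose the df so that each sheet column becomes a row
df_transposed = df.T
#transform all entries to strings (including nan, which becomes 'nan')
#note: applymap is deprecated since pandas 2.1, where DataFrame.map is the replacement
df_transposed = df_transposed.applymap(str)
#convert to a list of lists, as in the question
res = df_transposed.to_numpy().tolist()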
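You made an effort to provide information with your question, but it would have been very helpful if you had also provided a test DataFrame like this: df = pd.DataFrame(data=... As a result, the code above is untested!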
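Since no test DataFrame was provided, here is a minimal sketch that rebuilds the data from the printed output in the question. It assumes the sheet has already been read with the empty first row skipped and no header row, so the frame below is a hypothetical stand-in for the real file rather than the actual read_excel result. To avoid depending on a particular pandas version, it converts each cell with Python's built-in str() instead of applymap.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the sheet, copied from the output printed in the
# question (assumes: empty first row skipped, no header row, all cells read as str).
df = pd.DataFrame(
    [
        [np.nan, np.nan, np.nan, "123456"],
        ["Title_1", "Title_2", "Title_3", "Title_4"],
        ["RED_100", "GREEN_200", "RED_300", "YELLOW_100"],
        ["2019-01-01 00:00:00", "2018-02-02 00:00:00",
         "2019-11-15 00:00:00", "2019-01-01 00:00:00"],
        ["10", "20", "30", "40"],
        [np.nan, np.nan, "Total", "100"],
    ]
)

# Transpose so that each sheet column becomes one list, then convert every
# cell with str(); str() turns a missing value (float nan) into 'nan'.
res = [[str(value) for value in row] for row in df.T.to_numpy().tolist()]

for row in res:
    print(row)
```

With this stand-in data, res[0] is ['nan', 'Title_1', 'RED_100', '2019-01-01 00:00:00', '10', 'nan'], which matches the first line of the target output in the question.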