
Compare DataFrames / CSVs and return only the columns that differ, including the key value

Python

千萬里不及你 2021-04-09 18:15:25

I have two CSV files that I am comparing, and I want to return only the columns whose values differ, side by side.

df1

    Country    1980     1981     1982     1983     1984
    Bermuda    0.00793  0.00687  0.00727  0.00971  0.00752
    Canada     9.6947   9.58952  9.20637  9.18989  9.78546
    Greenland  0.00791  0.00746  0.00722  0.00505  0.00799
    Mexico     3.72819  4.11969  4.33477  4.06414  4.18464

df2

    Country    1980     1981     1982     1983     1984
    Bermuda    0.77777  0.00687  0.00727  0.00971  0.00752
    Canada     9.6947   9.58952  9.20637  9.18989  9.78546
    Greenland  0.00791  0.00746  0.00722  0.00505  0.00799
    Mexico     3.72819  4.11969  4.33477  4.06414  4.18464

My code:

    import pandas as pd
    import numpy as np

    df1 = pd.read_csv('csv1.csv')
    df2 = pd.read_csv('csv2.csv')

    def diff_pd(df1, df2):
        """Identify differences between two pandas DataFrames"""
        assert (df1.columns == df2.columns).all(), \
            "DataFrame column names are different"
        if any(df1.dtypes != df2.dtypes):
            "Data Types are different, trying to convert"
            df2 = df2.astype(df1.dtypes)
        if df1.equals(df2):
            print("Dataframes are the same")
            return None
        else:
            # need to account for np.nan != np.nan returning True
            diff_mask = (df1 != df2) & ~(df1.isnull() & df2.isnull())
            ne_stacked = diff_mask.stack()
            changed = ne_stacked[ne_stacked]
            changed.index.names = ['Country', 'Column']
            difference_locations = np.where(diff_mask)
            changed_from = df1.values[difference_locations][0]
            changed_to = df2.values[difference_locations]
            y = pd.DataFrame({'From': changed_from, 'To': changed_to}, index=changed.index)
            print(y)
            return pd.DataFrame({'From': changed_from, 'To': changed_to}, index=changed.index)

    diff_pd(df1, df2)

My current output is:

                       From       To
    Country Column
    0       1980    0.00793  0.77777

So instead of the index 0, I want the country name of the row where the values do not match. Here is an example of the output I would like:

                       From       To
    Country Column
    Bermuda 1980    0.00793  0.77777

Thanks to anyone who can provide a solution.


1 Answer

函數式編程


A shorter approach, which also renames the index level along the way:

    def process_df(df):
        res = df.set_index('Country').stack()
        res.index.rename('Column', level=1, inplace=True)
        return res

    df1 = process_df(df1)
    df2 = process_df(df2)
    mask = (df1 != df2) & ~(df1.isnull() & df2.isnull())
    df3 = pd.concat([df1[mask], df2[mask]], axis=1).rename({0:'From', 1:'To'}, axis=1)
    df3

                       From       To
    Country Column
    Bermuda 1980    0.00793  0.77777
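For anyone who wants to try this without the original files, here is a minimal self-contained sketch of the same idea: make `Country` the index before stacking, so the row label in the result is the country name rather than the default integer index from `read_csv`. The inline CSV strings, the comma separator, and the helper names `to_long` and `diff_long` are assumptions made for illustration, not part of the original post.

```python
import io

import pandas as pd

# Inline stand-ins for csv1.csv and csv2.csv (assumed to be comma-separated).
CSV1 = """Country,1980,1981,1982,1983,1984
Bermuda,0.00793,0.00687,0.00727,0.00971,0.00752
Canada,9.6947,9.58952,9.20637,9.18989,9.78546
Greenland,0.00791,0.00746,0.00722,0.00505,0.00799
Mexico,3.72819,4.11969,4.33477,4.06414,4.18464
"""
# Second file: identical except for Bermuda's 1980 value.
CSV2 = CSV1.replace("Bermuda,0.00793", "Bermuda,0.77777")


def to_long(df):
    """Index by Country, then stack the year columns into a (Country, Column) Series."""
    long = df.set_index("Country").stack()
    long.index.names = ["Country", "Column"]
    return long


def diff_long(df1, df2):
    """Return one row per differing cell, with 'From' and 'To' columns."""
    s1, s2 = to_long(df1), to_long(df2)
    # Treat NaN in both inputs as "no change", since NaN != NaN is True.
    mask = (s1 != s2) & ~(s1.isnull() & s2.isnull())
    return pd.DataFrame({"From": s1[mask], "To": s2[mask]})


df1 = pd.read_csv(io.StringIO(CSV1))
df2 = pd.read_csv(io.StringIO(CSV2))
result = diff_long(df1, df2)
print(result)
```

This sketch assumes both files list the same countries and the same columns. If they do not, the `!=` comparison on the stacked Series raises because the labels differ, so the inputs would need to be aligned first.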

2021-04-20
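As a possible alternative to building the mask by hand, pandas 1.1 and later provide `DataFrame.compare`, which reports only the differing cells. This is a different technique from the answer above, sketched here under the assumption that both frames have identical row and column labels. The small in-memory frames are hypothetical stand-ins for the two CSV files.

```python
import pandas as pd

# Hypothetical in-memory stand-ins for the two CSV files (subset of the data).
df1 = pd.DataFrame({
    "Country": ["Bermuda", "Canada"],
    "1980": [0.00793, 9.6947],
    "1981": [0.00687, 9.58952],
})
df2 = df1.copy()
df2.loc[df2["Country"] == "Bermuda", "1980"] = 0.77777

# compare() keeps only rows and columns that differ. Its columns form a
# two-level index: (original column name, "self" or "other").
wide = df1.set_index("Country").compare(df2.set_index("Country"))

# Move the original column name into the row index, then rename the
# "self"/"other" columns to match the From/To layout from the question.
long = (
    wide.stack(level=0)
    .dropna(how="all")  # drop (row, column) pairs that did not change
    .rename(columns={"self": "From", "other": "To"})[["From", "To"]]
)
long.index.names = ["Country", "Column"]
print(long)
```

`compare` raises if the two frames are not identically labeled, so it fits the same situation as the answer above: two files with the same countries and the same year columns.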
