
A better way to run calculations on multiple DataFrames and return statements?

Python

叮當(dāng)貓咪 2023-06-27 17:34:06

My function looks at 3 DataFrames, filters each one between different dates, and creates a statement. As you can see, the function repeats the same steps over and over, and I would like to reduce them. I believe a for-loop would help, but I'm not sure how to return the statements in a short paragraph the way it does now.

def stat_generator(df,date1,date2,df2,date3,date4,df4,date5,date6):
    ##First Date Filter for First Dataframe, and calculations for first dataframe
    df['Announcement Date'] = pd.to_datetime(df['Announcement Date'])
    mask = ((df['Announcement Date'] >= date1) & (df['Announcement Date'] <= date2))
    df_new = df.loc[mask]
    total = len(df_new)
    better = df_new[(df_new['performance'] == 'better')]
    better_perc = round(((len(better)/total)*100),2)
    worse = df_new[(df_new['performance'] == 'worse')]
    worse_perc = round(((len(worse)/total)*100),2)
    statement1 = "During the time period between {} and {}, {} % of the students performed better. {} % of the students performed worse" .format(date1,date2,better_perc,worse_perc)

    ##Second Date Filter for Second Dataframe, and calculations for second dataframe
    df2['Announcement Date'] = pd.to_datetime(df2['Announcement Date'])
    mask2 = ((df2['Announcement Date'] >= date3) & (df2['Announcement Date'] <= date4))
    df_new2 = df2.loc[mask2]
    total2 = len(df_new2)
    better2 = df_new2[(df_new2['performance'] == 'better')]
    better_perc2 = round(((len(better2)/total2)*100),2)
    worse2 = df_new2[(df_new2['performance'] == 'worse')]
    worse_perc2 = round(((len(worse2)/total2)*100),2)
    statement2 = "During the time period between {} and {}, {} % of the students performed better. {} % of the students performed worse" .format(date3,date4,better_perc2,worse_perc2)

    ##Third Date Filter for Third Dataframe, and calculations for third dataframe


2 Answers

www說

Contributed 1775 experience points · received more than 8 likes

I would simply pass 3 arguments to your function, namely df, date1, and date2, and then call the function 3 times.

def stat_generator(df, date1, date2):
    "..."
    return statement

Then pass your data as a list of lists or something similar. For example:

data = [[df, date1, date2], [df2, date3, date4], [df4, date5, date6]]

for lists in data:
    stat_generator(*lists)
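The loop above calls the function but discards each returned statement. A minimal sketch of collecting them instead, assuming a three-argument `stat_generator` along the lines suggested here. The function body is a condensed guess at the question's logic, and the sample frame and dates are made up for illustration:

```python
import pandas as pd


def stat_generator(df, date1, date2):
    # Hypothetical three-argument version; the body condenses the question's logic.
    dates = pd.to_datetime(df["Announcement Date"])
    window = df.loc[(dates >= date1) & (dates <= date2)]
    total = len(window)
    better_perc = round(float((window["performance"] == "better").sum()) / total * 100, 2)
    worse_perc = round(float((window["performance"] == "worse").sum()) / total * 100, 2)
    return (
        f"During the time period between {date1} and {date2}, "
        f"{better_perc}% of the students performed better. "
        f"{worse_perc}% of the students performed worse"
    )


# Made-up sample data standing in for the asker's DataFrames.
sample = pd.DataFrame(
    {
        "Announcement Date": ["2020-01-05", "2020-01-20", "2020-02-10", "2020-03-01"],
        "performance": ["better", "worse", "better", "better"],
    }
)

data = [
    [sample, "2020-01-01", "2020-01-31"],
    [sample, "2020-01-01", "2020-03-31"],
]

# Collect each returned statement instead of dropping it.
statements = [stat_generator(*args) for args in data]
for statement in statements:
    print(statement)
```

Joining the list with `" ".join(statements)` would give the single short paragraph the question asks for.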

2023-06-27

尚方寶劍之說

Contributed 1788 experience points · received more than 4 likes

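Keeping the existing form

Rename the df parameter of stat_generator to df1, so that df can be used in the for-loop.
Group each DataFrame together with its dates.
Create a statements list to be returned.
date1 and date2 become d1 and d2 inside the loop.
Update statement1 to use an f-string, which is easier to read.
I think these updates require the fewest changes to the overall code.
Optional:
- Change mask to mask = df['Announcement Date'].between(d1, d2). Both endpoints are included by default; pandas 2.0 and later no longer accept inclusive=True and expect inclusive='both' instead.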

import pandas as pd

def stat_generator(df1, date1, date2, df2, date3, date4, df4, date5, date6):
    # pair each dataframe with its start and end dates
    # (the third dataframe parameter is df4, so df4 is used here rather than df3)
    groups = [(df1, date1, date2), (df2, date3, date4), (df4, date5, date6)]

    # create a list to hold one statement per group
    statements = list()

    # iterate through each group
    for (df, d1, d2) in groups:
        df['Announcement Date'] = pd.to_datetime(df['Announcement Date'])
        mask = ((df['Announcement Date'] >= d1) & (df['Announcement Date'] <= d2))
        df_new = df.loc[mask]
        total = len(df_new)
        better = df_new[(df_new['performance'] == 'better')]
        better_perc = round(((len(better)/total)*100),2)
        worse = df_new[(df_new['performance'] == 'worse')]
        worse_perc = round(((len(worse)/total)*100),2)
        statement1 = f"During the time period between {d1} and {d2}, {better_perc}% of the students performed better. {worse_perc}% of the students performed worse"

        # append the statement for this dataframe
        statements.append(statement1)

    # return a list of all the statements
    return statements
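The optional `between` change can be checked against the compound comparison used in the loop. The sketch below uses made-up dates and assumes nothing beyond pandas itself. `Series.between` includes both endpoints by default, so the `inclusive` argument can be left out. Recent pandas versions expect a string such as `inclusive="both"` rather than a boolean.

```python
import pandas as pd

# Made-up dates for illustration.
dates = pd.to_datetime(
    pd.Series(["2020-01-01", "2020-01-15", "2020-01-31", "2020-02-01"])
)
d1, d2 = "2020-01-01", "2020-01-31"

# Compound comparison, as written in the loop.
mask_compare = (dates >= d1) & (dates <= d2)

# Series.between includes both endpoints by default.
mask_between = dates.between(d1, d2)

print(mask_between.tolist())
print(mask_compare.equals(mask_between))
```

Both masks select the same rows, so the swap is purely a readability choice.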

Complete rewrite

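It is better for a function to do one thing, which in this case is to extract and return the data.
Deal with passing multiple DataFrames to the function outside of the function, and collect the results in a list or print them.
Creating new DataFrames for better and worse is not efficient.
- Use .value_counts() with normalize=True to get the percentages.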
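As a small illustration of the `.value_counts()` suggestion, the sketch below runs it on a made-up Series (the label `"same"` is an assumption added only to show that other values may appear). With `normalize=True` the result holds fractions of the total, which are then scaled to percentages.

```python
import pandas as pd

# Made-up performance labels for illustration.
performance = pd.Series(["better", "worse", "better", "same", "better"])

# normalize=True returns each label's share of the total instead of a raw count.
per = (performance.value_counts(normalize=True) * 100).round(2)

print(per["better"])
print(per["worse"])

# .get avoids a KeyError when a label never occurs in the filtered rows.
print(per.get("missing-label", 0.0))
```

Indexing with `per['better']` raises a `KeyError` if no row in the date window has that label, so `.get` with a default may be the safer lookup.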

import pandas as pd

def stat_generator(df: pd.DataFrame, d1: str, d2: str) -> str:
    df['Announcement Date'] = pd.to_datetime(df['Announcement Date'])
    # create the mask; between() includes both endpoints by default
    mask = df['Announcement Date'].between(d1, d2)
    # apply the mask
    df_new = df.loc[mask]
    # calculate the percentage of each performance label
    per = (df_new.performance.value_counts(normalize=True) * 100).round(2)
    return f"During the time period between {d1} and {d2}, {per['better']}% of the students performed better. {per['worse']}% of the students performed worse"

# pair each dataframe with its dates, outside of the function
groups = [(df1, date1, date2), (df2, date3, date4), (df3, date5, date6)]

statements = list()
for group in groups:
    statements.append(stat_generator(*group))
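Taking the "do one thing" advice a step further, the function could return only the numbers and leave the wording to the caller. The following is a sketch under stated assumptions, not the answer's code: `performance_percentages` is a hypothetical helper name, the sample frame is made up, and an empty date window is reported as 0.0 for both labels.

```python
import pandas as pd


def performance_percentages(df: pd.DataFrame, d1: str, d2: str) -> dict:
    """Hypothetical helper: return percentages only; formatting is left to the caller."""
    dates = pd.to_datetime(df["Announcement Date"])  # does not modify the caller's frame
    window = df.loc[dates.between(d1, d2)]
    per = (window["performance"].value_counts(normalize=True) * 100).round(2)
    # An empty window yields an empty Series, so both lookups fall back to 0.0.
    return {
        "better": float(per.get("better", 0.0)),
        "worse": float(per.get("worse", 0.0)),
    }


# Made-up sample data for illustration.
df1 = pd.DataFrame(
    {
        "Announcement Date": ["2021-03-01", "2021-03-10", "2021-03-20", "2021-04-02"],
        "performance": ["better", "better", "worse", "worse"],
    }
)

groups = [(df1, "2021-03-01", "2021-03-31"), (df1, "2022-01-01", "2022-01-31")]

for df, d1, d2 in groups:
    stats = performance_percentages(df, d1, d2)
    print(f"Between {d1} and {d2}: {stats['better']}% better, {stats['worse']}% worse")
```

Returning plain numbers keeps the sentence template in one place and makes the calculation easy to test on its own.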

2023-06-27
