
Filtering a CSV file using function arguments

Python

慕運維8079593 2023-09-26 14:10:14

I'm writing a function that filters a CSV file based on the function's arguments and then finds the average of one column after filtering. I'm only allowed to use import csv (no pandas), and I can't use lambda or any other "advanced" Python shortcuts. I think I can handle the averaging part easily, but I'm having trouble filtering with the parameters and constraints I mentioned. Normally I'd use pandas for this, which makes the process much easier, but I can't. Here is my code:

    def calc_avg(self, specific, filter, logic, threshold):
        with open(self.load_data, 'r') as avg_file:
            for row in csv.DictReader(avg_file, delimiter= ','):
                specific = row[specific]
                filter = int(row[filter])
                logic = logic
                threshold = 0
                if logic == 'lt':
                    filter < threshold
                elif logic == 'gt':
                    filter > threshold
                elif logic == 'lte':
                    filter <= threshold
                elif logic == 'gte':
                    filter >= threshold

It should work with this command:

    print(csv_data.calc_avg("Length_of_stay", filter="SOFA", logic="lt", threshold="15"))

That is the format of the code and of the column headers. Sample data:

    RecordID  SAPS-I  SOFA  Length_of_stay
    132539    6       1     5
    132540    16      8     8
    132541    21      11    19
    132545    17      2     4
    132547    14      11    6
    132548    14      4     9
    132551    19      8     6
    132554    11      0     17


2 Answers

狐的傳說

Contributed 1804 experience points, received more than 3 likes

Update

This option evaluates logic once and returns a compare function that can be used while iterating over the rows. It is faster when the data has many rows.

    import csv
    import io

    # written as a function because you don't share the definition of load_data
    # but the main idea can be translated to a class
    def calc_avg(self, specific, filter, logic, threshold):
        if isinstance(threshold, str):
            threshold = float(threshold)

        def lt(a, b): return a < b
        def gt(a, b): return a > b
        def lte(a, b): return a <= b
        def gte(a, b): return a >= b

        # choose the comparison once, before the loop
        if logic == 'lt': compare = lt
        elif logic == 'gt': compare = gt
        elif logic == 'lte': compare = lte
        elif logic == 'gte': compare = gte
        else: raise ValueError('unknown logic: ' + logic)

        with io.StringIO(self) as avg_file: # change to open an actual file
            running_sum = running_count = 0
            for row in csv.DictReader(avg_file, delimiter=','):
                if compare(int(row[filter]), threshold):
                    running_sum += int(row[specific])
                    # or float(row[specific])
                    running_count += 1

        if running_count == 0:
            # not even one row passed the filter
            return 0
        else:
            return running_sum / running_count

    # data is the CSV string defined in the original answer below
    print(calc_avg(data, 'Length_of_stay', 'SOFA', 'lt', '15'))
    print(calc_avg(data, 'Length_of_stay', 'SOFA', 'lt', '2'))
    print(calc_avg(data, 'Length_of_stay', 'SOFA', 'lt', '0'))

Output

9.25

11.0

0
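The idea in this update, choosing the comparison once instead of re-checking logic on every row, could also be written as a dictionary lookup. This is only a sketch under assumptions: COMPARATORS and pick_compare are hypothetical names that do not appear in the answer, and raising ValueError for an unknown logic string is an assumed behavior, not something the question specifies.

```python
def lt(a, b):
    return a < b

def gt(a, b):
    return a > b

def lte(a, b):
    return a <= b

def gte(a, b):
    return a >= b

# Hypothetical lookup table: maps each logic string to its comparison function.
COMPARATORS = {'lt': lt, 'gt': gt, 'lte': lte, 'gte': gte}

def pick_compare(logic):
    """Return the comparison function for `logic`, or raise on an unknown name."""
    if logic not in COMPARATORS:
        raise ValueError('unknown logic: ' + repr(logic))
    return COMPARATORS[logic]

compare = pick_compare('lt')
print(compare(1, 15))   # True
print(compare(15, 15))  # False
```

The lookup happens once, so the loop over rows only calls compare and never inspects the logic string again. It uses plain named functions, in keeping with the question's "no lambda" constraint.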

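Original answer

To filter the rows, you have to actually perform the comparison once you have determined which kind of inequality applies. The code here stores the result in the boolean include.

You can then keep two variables, running_sum and running_count, and divide one by the other at the end to return the average.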

    import io
    import csv

    # written as a function because you don't share the definition of load_data
    # but the main idea can be translated to a class
    def calc_avg(self, specific, filter, logic, threshold):
        if isinstance(threshold, str):
            threshold = float(threshold)

        with io.StringIO(self) as avg_file: # change to open an actual file
            running_sum = running_count = 0
            for row in csv.DictReader(avg_file, delimiter=','):
                # your code has: filter = int(row[filter])
                value = int(row[filter]) # avoid overwriting parameters

                if logic == 'lt' and value < threshold:
                    include = True
                elif logic == 'gt' and value > threshold:
                    include = True
                elif logic == 'lte' and value <= threshold: # should it be 'le'
                    include = True
                elif logic == 'gte' and value >= threshold: # should it be 'ge'
                    include = True
                else:
                    include = False
                # note: ast.literal_eval cannot replace these branches;
                # it only evaluates literals, not comparisons

                if include:
                    running_sum += int(row[specific])
                    # or float(row[specific])
                    running_count += 1

        # note: raises ZeroDivisionError if no row passes the filter;
        # the update above guards against this
        return running_sum / running_count

    data = """RecordID,SAPS-I,SOFA,Length_of_stay
    132539,6,1,5
    132540,16,8,8
    132541,21,11,19
    132545,17,2,4
    132547,14,11,6
    132548,14,4,9
    132551,19,8,6
    132554,11,0,17"""

    print(calc_avg(data, 'Length_of_stay', 'SOFA', 'lt', '15'))
    print(calc_avg(data, 'Length_of_stay', 'SOFA', 'lt', '2'))

Output

9.25

11.0
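To check these two results by hand, it can help to list the Length_of_stay values that pass the filter before averaging them. A small sketch, assuming the same comma-separated sample data; passing_values is a hypothetical helper that is not part of the answer, and it handles only the "less than" case.

```python
import csv
import io

data = """RecordID,SAPS-I,SOFA,Length_of_stay
132539,6,1,5
132540,16,8,8
132541,21,11,19
132545,17,2,4
132547,14,11,6
132548,14,4,9
132551,19,8,6
132554,11,0,17"""

def passing_values(text, specific, filter_col, threshold):
    """Collect `specific` values from rows whose `filter_col` is below `threshold`."""
    kept = []
    with io.StringIO(text) as f:
        for row in csv.DictReader(f, delimiter=','):
            if int(row[filter_col]) < threshold:
                kept.append(int(row[specific]))
    return kept

below_15 = passing_values(data, 'Length_of_stay', 'SOFA', 15)
below_2 = passing_values(data, 'Length_of_stay', 'SOFA', 2)
print(below_15, sum(below_15) / len(below_15))  # [5, 8, 19, 4, 6, 9, 6, 17] 9.25
print(below_2, sum(below_2) / len(below_2))     # [5, 17] 11.0
```

Every sample row has SOFA below 15, so the first average is 74 / 8 = 9.25; only the rows with SOFA 1 and 0 fall below 2, giving (5 + 17) / 2 = 11.0.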

Posted 2023-09-26

陪伴而非守候

Contributed 1757 experience points, received more than 8 likes

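You aren't doing anything with the results of the comparisons. You need to use them in if statements so that the specific value is included in the average calculation.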

    def calc_avg(self, specific, filter, logic, threshold):
        # convert the threshold argument once instead of overwriting it with 0
        threshold = float(threshold)
        with open(self.load_data, 'r') as avg_file:
            values = []
            for row in csv.DictReader(avg_file, delimiter= ','):
                # use new names so the column-name parameters are not overwritten
                specific_value = int(row[specific])
                filter_value = int(row[filter])
                if logic == 'lt' and filter_value < threshold:
                    values.append(specific_value)
                elif logic == 'gt' and filter_value > threshold:
                    values.append(specific_value)
                elif logic == 'lte' and filter_value <= threshold:
                    values.append(specific_value)
                elif logic == 'gte' and filter_value >= threshold:
                    values.append(specific_value)
        if len(values) > 0:
            return sum(values) / len(values)
        else:
            return 0
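For reference, here is one way the method could sit inside a class so that the call from the question runs end to end. This is a sketch under assumptions: the class name CsvData, the constructor, and the temporary file are hypothetical, since the question does not show how load_data is defined, and the file is assumed to be comma-separated.

```python
import csv
import os
import tempfile

class CsvData:  # hypothetical class name; the question does not show the real one
    def __init__(self, load_data):
        self.load_data = load_data  # assumed to hold the path of the CSV file

    def calc_avg(self, specific, filter, logic, threshold):
        threshold = float(threshold)
        values = []
        with open(self.load_data, 'r', newline='') as avg_file:
            for row in csv.DictReader(avg_file, delimiter=','):
                filter_value = int(row[filter])
                if logic == 'lt':
                    keep = filter_value < threshold
                elif logic == 'gt':
                    keep = filter_value > threshold
                elif logic == 'lte':
                    keep = filter_value <= threshold
                elif logic == 'gte':
                    keep = filter_value >= threshold
                else:
                    keep = False  # unknown logic strings keep no rows
                if keep:
                    values.append(int(row[specific]))
        if len(values) > 0:
            return sum(values) / len(values)
        return 0

# Write the sample data to a temporary file so the example is self-contained.
sample = (
    "RecordID,SAPS-I,SOFA,Length_of_stay\n"
    "132539,6,1,5\n132540,16,8,8\n132541,21,11,19\n132545,17,2,4\n"
    "132547,14,11,6\n132548,14,4,9\n132551,19,8,6\n132554,11,0,17\n"
)
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write(sample)
    path = tmp.name

csv_data = CsvData(path)
result_lt = csv_data.calc_avg("Length_of_stay", filter="SOFA", logic="lt", threshold="15")
result_gte = csv_data.calc_avg("Length_of_stay", filter="SOFA", logic="gte", threshold="8")
os.remove(path)  # clean up the temporary file

print(result_lt)   # 9.25
print(result_gte)  # 9.75
```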

Posted 2023-09-26
