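Although I haven't tested this myself, I believe you can count the generated warnings by capturing them and checking the length of the returned list of captured warnings. Then add that count to the current number of rows in the dataframe: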
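import warnings
import pandas as pd

# error_bad_lines / warn_bad_lines are the older pandas parameters;
# newer pandas releases replace them with on_bad_lines.
with warnings.catch_warnings(record=True) as warning_list:
    warnings.simplefilter("always")  # record every warning, including repeats
    someDataframe = pd.read_csv(
        filepath_or_buffer=our_filepath_here,
        error_bad_lines=False,
        warn_bad_lines=True,
    )

# May want to check that each warning object is a pandas "bad line" warning
number_of_warned_lines = len(warning_list)
initialRowCount = len(someDataframe) + number_of_warned_lines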
https://docs.python.org/3/library/warnings.html#warnings.catch_warnings
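To illustrate the "check each warning object" idea mentioned in the comment above, here is a minimal standard-library sketch. `BadLineWarning` and `noisy_parse` are hypothetical stand-ins invented for this illustration, not pandas names; the point is only to show filtering the recorded list by category before counting.

```python
import warnings


class BadLineWarning(UserWarning):
    """Hypothetical stand-in for a parser's "bad line" warning category."""


def noisy_parse():
    """Hypothetical parser: warns about two bad lines and one unrelated issue."""
    warnings.warn("Skipping line 3: expected 4 fields, saw 6", BadLineWarning)
    warnings.warn("Skipping line 6: expected 4 fields, saw 5", BadLineWarning)
    warnings.warn("something unrelated", DeprecationWarning)


with warnings.catch_warnings(record=True) as warning_list:
    # Without this filter, repeated or ignored-by-default warnings may be dropped.
    warnings.simplefilter("always")
    noisy_parse()

# Count only the warnings whose category matches the one we care about.
bad_line_warnings = [
    w for w in warning_list if issubclass(w.category, BadLineWarning)
]
number_of_warned_lines = len(bad_line_warnings)
print(len(warning_list), number_of_warned_lines)
```

Filtering by `w.category` keeps unrelated warnings (such as deprecation notices) from inflating the count.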
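Edit: It took a little while, but this seems to work with Pandas. Instead of relying on the built-in warnings, we temporarily redirect stderr. We can then count how many times "Skipping line" appears in the captured string, which gives the number of bad lines that produced this warning message!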
import contextlib
import io
import pandas as pd

# Lines 3 and 6 have more fields than the 4-column header, so pandas skips them
bad_data = io.StringIO("""
a,b,c,d
1,2,3,4
f,g,h,i,j,
l,m,n,o
p,q,r,s
7,8,9,10,11
""".lstrip())

# Capture everything written to stderr while the file is being parsed
new_stderr = io.StringIO()
with contextlib.redirect_stderr(new_stderr):
    df = pd.read_csv(bad_data, error_bad_lines=False, warn_bad_lines=True)

n_warned_lines = new_stderr.getvalue().count("Skipping line")
print(n_warned_lines)  # 2
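As a rough cross-check that does not depend on the wording of the warning message, the same sample can be scanned with the standard `csv` module. This is a sketch under stated assumptions, not what pandas does internally: it assumes the first row is the header and treats any row with more fields than the header as "bad". `count_overlong_rows` is a hypothetical helper name.

```python
import csv
import io

raw_text = """
a,b,c,d
1,2,3,4
f,g,h,i,j,
l,m,n,o
p,q,r,s
7,8,9,10,11
""".lstrip()


def count_overlong_rows(text):
    """Count rows that have more fields than the header row (assumed to be first)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return 0  # empty input: nothing to count
    expected = len(header)
    return sum(1 for row in reader if len(row) > expected)


n_overlong = count_overlong_rows(raw_text)
print(n_overlong)
```

If this number and the count taken from stderr disagree, the difference is likely down to quoting or delimiter settings, which this sketch leaves at the `csv` defaults.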