I have a folder containing thousands of files. I am trying to parse the XML tags in them with beautifulsoup4. I can do this for each file individually, but I can't get my script to work with a for loop. Here is my code so far:

    import bs4 as bs
    import glob

    path = r"~/Desktop/pythontest/*.txt"
    files = glob.glob(path)

    # ------------------------READ AND PARSE TEXT-----------------------------------------
    for f in files:
        # open file in read mode
        source = open(f, "rt")
        # parse xml as soup
        soup = bs.BeautifulSoup(source, "lxml")
        soupText = soup.get_text()
        text = soupText.replace(r"\n", " ")
        # close file
        source.close()

    # --------------------------OVERWRITE FILE---------------------------------------------
    for f in files:
        # open file in write mode
        source = open(f, "wt")
        # overwrite the file with the soup
        source.write((text))
        # # close file
        source.close()

    print(text)

When I run it, the console gives me this:

    Traceback (most recent call last):
      File "./camltest.py", line 34, in <module>
        print(text)
    NameError: name 'text' is not defined

I suspect this is a scope issue, but I can't fix it. Any suggestions? Thanks.
2 Answers

POPMUISE
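You can simply read each file and then write it back within the same loop, so each file is overwritten with its own text rather than with whatever the last file produced. Open the file with "r+" rather than "w+", because "w+" truncates the file before anything can be read from it.

    for f in files:
        # "r+" opens for reading and writing without emptying the file first
        with open(f, "r+") as source:
            soup = bs.BeautifulSoup(source, "lxml")
            # "\n" matches real newlines; r"\n" would match a literal backslash followed by n
            text = soup.get_text().replace("\n", " ")
            # rewind, overwrite, then cut off any leftover tail of the longer original
            source.seek(0)
            source.write(text)
            source.truncate()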
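One likely cause of the NameError itself: `glob.glob` does not expand `~`, so the pattern `~/Desktop/pythontest/*.txt` probably matches nothing. With an empty `files` list the loop body never runs, and `text` is never assigned before `print(text)`. The following is a minimal sketch of one way to handle this, not a definitive fix. The helper names `xml_to_text` and `rewrite_files` are hypothetical, and UTF-8 encoding is an assumption about the files.

```python
import glob
import os


def xml_to_text(markup):
    """Strip tags with BeautifulSoup and flatten newlines to spaces."""
    # Imported here so the file helper below can be used without bs4 installed.
    import bs4

    soup = bs4.BeautifulSoup(markup, "lxml")
    return soup.get_text().replace("\n", " ")


def rewrite_files(pattern, transform):
    """Apply transform to the contents of every file matching pattern, in place.

    Returns the list of matched paths so the caller can see whether
    anything matched at all.
    """
    # glob does not expand "~"; expanduser turns it into the home directory.
    paths = glob.glob(os.path.expanduser(pattern))
    for path in paths:
        # Assumption: the files are UTF-8 text.
        with open(path, "r", encoding="utf-8") as fh:
            original = fh.read()
        # Reopening in "w" mode truncates, which is safe now that the
        # original content is already in memory.
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(transform(original))
    return paths
```

Calling `rewrite_files("~/Desktop/pythontest/*.txt", xml_to_text)` would process the folder from the question. Checking the length of the returned list is a quick way to confirm that the pattern matched any files.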