I am using this JavaScript to validate a form:

<script type="text/javascript">
function validateForm() {
    var a = document.forms["orderform"]["Name"].value;
    var b = document.forms["orderform"]["Street"].value;
    var c = document.forms["orderform"]["ZIP"].value;
    var d = document.forms["orderform"]["City"].value;
    var e = document.forms["orderform"]["PhoneNumber"].value;
    if (a == null || a == "" || b == null || b == "" || c == null || c == "" || d == null || d == "" || e == null || e == "") {
        alert("Please fill all the required fields.");
        return false;
    }
}
</script>

I am trying to capture the alert text with BeautifulSoup:

import re
from bs4 import BeautifulSoup

with open("index.html") as fp:
    soup = BeautifulSoup(fp, "lxml")

for script in soup.find_all(re.compile("(?<=alert\(\").+(?=\")")):
    print(script)

This does not return anything. It is based on the example given under "A regular expression" in the BS documentation, which finds tag names that start with "b":

import re
for tag in soup.find_all(re.compile("^b")):
    print(tag.name)
# body
# b

But I can't seem to find the equivalent of 'print(tag.name)' that would print the alert text. Or am I on the wrong track entirely? Any help is much appreciated.

EDIT: I tried:

pattern = re.compile("(?<=alert\(\").+(?=\")")
for script in soup.find_all('script'):
    print(script.pattern)

This returns "None".
1 Answer

慕森卡
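Running the regular expression against the whole HTML document will not work: a regex passed to find_all() is matched against tag names, not against the text inside the tags. First extract the contents of the script tag; after that, the alert text is easy to parse out.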
import re
from bs4 import BeautifulSoup

with open("index.html") as fp:
    soup = BeautifulSoup(fp, "lxml")

# Take the first <script> tag out of the parsed tree
script = soup.find("script").extract()

# Find all alert text; the non-greedy .+? stops at the first closing quote
alert = re.findall(r'(?<=alert\(\").+?(?=\")', script.text)
print(alert)
Output:
['Please fill all the required fields.']
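If the script may contain several alert() calls, or uses single quotes, the regex step can be loosened a little. The following is only a sketch under assumptions: extract_alert_messages is a hypothetical helper (not part of BeautifulSoup), the second alert in the sample is made up for illustration, and string literals containing escaped quotes are not handled. It expects the JavaScript source as a plain string, for example the text of the script tag obtained above.

```python
import re

# Hypothetical helper, not part of BeautifulSoup: it takes JavaScript source
# (for example the text of a <script> tag) and returns the string literal
# passed to each alert() call. Escaped quotes inside the literal are not handled.
ALERT_PATTERN = re.compile(r"""alert\(\s*(["'])(.*?)\1\s*\)""")


def extract_alert_messages(js_source):
    # group(1) is the opening quote character, group(2) is the message itself
    return [match.group(2) for match in ALERT_PATTERN.finditer(js_source)]


# Sample input: the first alert is from the question, the second is invented
js = '''
function validateForm() {
    if (a == "") { alert("Please fill all the required fields."); return false; }
    if (c == "") { alert('ZIP is required'); return false; }
}
'''

print(extract_alert_messages(js))
```

With BeautifulSoup, the same helper could be applied to each tag returned by soup.find_all("script"), skipping tags that have no inline text.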