2 Answers

Contributor with 2041 experience points · 4+ upvotes
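Apparently, .text on a script tag always returns an empty string (this may vary between Beautiful Soup versions). However, you can get the tag's contents from .children: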
from bs4 import BeautifulSoup
from io import StringIO
html = """
<script>
let a = "Hello";
</script>
"""
b = StringIO(html)
soup = BeautifulSoup(b, 'lxml')
for e in soup.find_all('script'):
    print(repr(e.text))  # reported above to be an empty string
    print(repr(''.join(e.children)))  # the script source, joined from the tag's child strings
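The same .children idea can be wrapped in a small helper when a page holds several script tags. This is only a sketch: the name script_sources is hypothetical, and it uses the built-in 'html.parser' rather than 'lxml' purely to avoid the extra dependency.

```python
from bs4 import BeautifulSoup


def script_sources(html):
    """Return the source text of every <script> tag (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Joining the child strings does not depend on how .text treats script content.
    return [
        "".join(str(child) for child in tag.children)
        for tag in soup.find_all("script")
    ]


html = '<script>let a = "Hello";</script><p>text</p><script>let b = 2;</script>'
sources = script_sources(html)
print(sources)
```

A script tag with no inline code (for example one that only has a src attribute) would yield an empty string in the returned list.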

Contributor with 1943 experience points · 7+ upvotes
You can use .string to access the contents of the <script> tag:
import re
import json
from bs4 import BeautifulSoup
html_doc = '''<script>
var teamsData = JSON.parse('\x7B\x2271\x22\x3A\x7B\x22id\x22\x3A\x2271\x22,\x22title\x22\x3A\x22Aston\x20Villa\x22,\x22history\x22\x3A\x5B\x5D\x7D,\x2272\x22\x3A\x7B\x22id\x22\x3A\x2272\x22\x7D\x7D');
</script>'''
soup = BeautifulSoup(html_doc, 'html.parser')
script_string = soup.find('script').string
print(script_string)
Prints:
var teamsData = JSON.parse('{"71":{"id":"71","title":"Aston Villa","history":[]},"72":{"id":"72"}}');
To parse the JSON data, you can use the re and json modules. For example:
data = re.search(r"JSON\.parse\('(.*?)'\);", script_string).group(1)
data = json.loads(data)
for k, v in data.items():
    print(k, v)
Prints:
71 {'id': '71', 'title': 'Aston Villa', 'history': []}
72 {'id': '72'}
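One caveat worth sketching: in the example above, html_doc is a normal (non-raw) Python string literal, so Python itself turns \x7B, \x22 and so on into real characters before Beautiful Soup sees them. In a page downloaded from a server, the script text would most likely contain the backslash escapes literally, and json.loads would fail on them. The sketch below assumes that situation (simulated with a raw string and a shortened payload) and assumes the payload is plain ASCII apart from the escapes.

```python
import json
import re

# Raw string: the backslashes are literal characters, as in a downloaded page.
script_string = (
    r"var teamsData = JSON.parse('\x7B\x2271\x22\x3A\x7B\x22id\x22\x3A\x2271\x22,"
    r"\x22title\x22\x3A\x22Aston\x20Villa\x22\x7D\x7D');"
)

match = re.search(r"JSON\.parse\('(.*?)'\);", script_string)
if match is None:
    raise ValueError("JSON.parse(...) call not found in the script text")

# unicode_escape decodes \x7B to {, \x22 to ", and so on.
# Assumption: the captured text is ASCII; non-ASCII content would need other handling.
decoded = match.group(1).encode("utf-8").decode("unicode_escape")
data = json.loads(decoded)
print(data["71"]["title"])
```

Checking the match for None before calling .group(1) avoids an AttributeError when the page layout changes and the JSON.parse call is no longer present.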