1 Answer

Using BeautifulSoup:
from bs4 import BeautifulSoup

test = '''<div>text_0<ul>
<li>text_1</li>
<li>text_2</li>
<li>text_3</li>
</ul>
</div>'''
soup = BeautifulSoup(test, 'html.parser')
# find_all returns every <div>; .text joins the text of the tag and all of its descendants
data = soup.find_all("div")
for d in data:
    print(d.text)
Output:
text_0
text_1
text_2
text_3
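Note that d.text keeps the newlines from the markup, so the printed result may end with blank lines. If a clean list of strings is more convenient, one option is BeautifulSoup's stripped_strings generator. This is a minimal sketch on the same sample markup; the variable names html, div and texts are chosen here only for illustration.

```python
from bs4 import BeautifulSoup

html = '''<div>text_0<ul>
<li>text_1</li>
<li>text_2</li>
<li>text_3</li>
</ul>
</div>'''

soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div')

# stripped_strings yields each text node with surrounding whitespace removed
# and skips nodes that contain only whitespace.
texts = list(div.stripped_strings)
print(texts)  # ['text_0', 'text_1', 'text_2', 'text_3']
```

div.get_text(separator='\n', strip=True) should give the same pieces joined into a single string.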
Using XPath:
import lxml.html as LH

content = '''<div>text_0<ul>
<li>text_1</li>
<li>text_2</li>
<li>text_3</li>
</ul>
</div>'''
root = LH.fromstring(content)
# descendant::text() selects every text node under the <div>, including whitespace-only ones
for elem in root.xpath('//div/descendant::text()'):
    print(elem)
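The XPath version also returns the newline-only text nodes that sit between the tags, so some printed lines are blank. A possible refinement, sketched here on the same sample, is to add a normalize-space() predicate so that only text nodes with visible content are selected. The name texts is illustrative.

```python
import lxml.html as LH

content = '''<div>text_0<ul>
<li>text_1</li>
<li>text_2</li>
<li>text_3</li>
</ul>
</div>'''

root = LH.fromstring(content)

# [normalize-space()] is true only for text nodes that are non-empty
# after whitespace is collapsed, so newline-only nodes are dropped.
texts = [t.strip() for t in root.xpath('//div/descendant::text()[normalize-space()]')]
print(texts)  # ['text_0', 'text_1', 'text_2', 'text_3']
```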