
Can I remove script tags with BeautifulSoup?

Html/CSS

尚方寶劍之說 2019-12-25 11:04:12

Is it possible to remove script tags and all of their contents from HTML with BeautifulSoup, or do I have to use regular expressions or something else?


3 Answers

POPMUISE

Contributed 1765 experience points, received more than 5 upvotes

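>>> from bs4 import BeautifulSoup
>>> # html.parser keeps the fragment as-is; lxml would wrap it in <html><body>
>>> soup = BeautifulSoup('<script>a</script>baba<script>b</script>', 'html.parser')
>>> [s.extract() for s in soup('script')]
[<script>a</script>, <script>b</script>]
>>> soup
baba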


慕哥9229398

Contributed 1877 experience points, received more than 6 upvotes

Updated answer for anyone who may need it for future reference: the right answer is decompose(). You can do this in other ways, but decompose() works in place.

Usage example:

from bs4 import BeautifulSoup

soup = BeautifulSoup('<p>This is a slimy text and <i> I am slimer</i></p>', 'html.parser')
soup.i.decompose()  # removes the <i> tag and its contents in place
print(str(soup))
# prints '<p>This is a slimy text and </p>'

It is very useful for getting rid of debris such as 'script', 'img', and so on.
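As a rough sketch of that cleanup, the same decompose() call can be applied to every matching tag in a loop. The helper name strip_tags and its default tag list are made up for illustration, and 'html.parser' is just one parser choice; treat this as one way to do it, not the only one.

```python
from bs4 import BeautifulSoup


def strip_tags(html, names=("script", "img")):
    """Return html with every tag listed in names (and its contents) removed.

    strip_tags is a hypothetical helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # find_all accepts a list of tag names and matches any of them
    for tag in soup.find_all(list(names)):
        tag.decompose()  # destroys the tag and its contents in place
    return str(soup)


cleaned = strip_tags('<p>Hi<script>alert(1)</script><img src="x.png"/> there</p>')
print(cleaned)  # prints '<p>Hi there</p>'
```

Passing a different tuple, for example names=("script", "style"), should remove those tags instead.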


素胚勾勒不出你

Contributed 1827 experience points, received more than 9 upvotes

As stated in the official documentation, you can use the extract method to remove all subtrees that match a search.

from bs4 import BeautifulSoup

a = BeautifulSoup("<html><body><script>aaa</script></body></html>", "html.parser")
# extract() detaches each matching tag from the tree and returns it
[x.extract() for x in a.find_all('script')]
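One practical difference from decompose() is that extract() hands back the removed tag, so the stripped scripts can still be inspected afterwards. The snippet below is a small sketch of that; the sample HTML and the variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup

html = "<html><body><script>aaa</script><p>text</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# extract() removes each tag from the tree and returns it
removed = [tag.extract() for tag in soup.find_all("script")]

remaining = str(soup)
removed_sources = [str(tag.string) for tag in removed]

print(remaining)        # prints '<html><body><p>text</p></body></html>'
print(removed_sources)  # prints ['aaa']
```

If the removed tags are not needed, decompose() is probably the simpler choice, since it discards them outright.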

