How to correctly scrape a web page title with Python

JavaScript

慕桂英546537 2019-04-06 08:31:19

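I download the page content with urllib and then use the re module to match the title with a regular expression, but some titles come out wrong. For example, the code below, which fetches the Apple page, prints App. Testing shows that the match result m is fine; the problem appears at strip().

    # -*- coding: utf-8 -*-
    import urllib
    import re

    url = 'http://apple.com'
    html = urllib.urlopen(url).read()
    # print html
    m = re.search(".*", html)
    print m.group()            # this prints Apple
    print m.group().strip("")  # the problem is probably in this expression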
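
One likely explanation, sketched under an assumption: the pattern and the strip() argument in the question are taken here to have contained the `<title>` and `</title>` tags, which are not visible in the code as posted. If so, the result App follows from how `str.strip(chars)` works: it treats its argument as a set of characters to remove from both ends, not as a literal substring, so the trailing "l" and "e" of "Apple" are removed along with the tag. The sample `html` string below is a hypothetical stand-in for the downloaded page, the regex is one possible choice rather than the asker's exact pattern, and the snippet is written for Python 3.

```python
import re

# Hypothetical stand-in for the downloaded page (not the real apple.com markup).
html = "<html><head><title>Apple</title></head><body></body></html>"

# Assumed pattern: the capture group isolates the text between the tags.
m = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)

matched = m.group(0)         # whole match, tags included
title = m.group(1).strip()   # captured text only; strip() with no argument trims whitespace

# strip(chars) removes ANY of the characters '<', '/', 't', 'i', 'l', 'e', '>'
# from both ends, so it keeps eating into "Apple" until it reaches "p".
stripped = matched.strip("</title>")

print(matched)
print(stripped)
print(title)
```

Reading the title from `m.group(1)` avoids strip() for tag removal altogether. For real pages, an HTML parser is generally more robust than a regular expression.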
