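import requests
from bs4 import BeautifulSoup

def getHtml(url, timeout=20):
    try:
        headers = {
            'Accept-Language': 'zh-cn',
            'Content-Type': 'application/x-www-form-urlencoded',
            'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.00; Windows NT 5.1; SV1)',
        }
        r = requests.get(url, headers=headers, timeout=timeout)
        html = r.text
        return html
    except Exception as ex:
        return None

soup = BeautifulSoup(getHtml(url))  # url: address of the page to fetch
print(soup.title)

How can the code above be improved so that the title of any web page can be retrieved without garbled characters? Note: for some pages, the extracted title is displayed as garbled text. What change would make this work in general?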
python: garbled text when fetching web page source with requests
皈依舞
2019-04-16 20:27:51
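
A possible direction, offered as a sketch rather than a definitive fix: garbling of this kind often comes from the response bytes being decoded with the wrong charset. When a text response carries no charset in its Content-Type header, requests falls back to ISO-8859-1 for r.text, which mangles GBK or UTF-8 pages. The example below decodes r.content itself, trying the charset declared in the HTTP header, then the one in the page's meta tag, then UTF-8. The names decode_page and get_html are hypothetical helpers introduced for illustration, and the code assumes Python 3.

```python
import re

# Charset declared in a meta tag, e.g. <meta charset="gbk"> or
# <meta http-equiv="Content-Type" content="text/html; charset=gb2312">
_META_CHARSET = re.compile(br'<meta[^>]+charset\s*=\s*["\']?\s*([\w-]+)', re.IGNORECASE)
# Charset declared in a Content-Type header, e.g. "text/html; charset=utf-8"
_HEADER_CHARSET = re.compile(r'charset\s*=\s*["\']?\s*([\w-]+)', re.IGNORECASE)


def decode_page(raw, content_type=''):
    """Decode page bytes, trying the declared charsets before falling back to UTF-8."""
    candidates = []
    header_match = _HEADER_CHARSET.search(content_type or '')
    if header_match:
        candidates.append(header_match.group(1))
    meta_match = _META_CHARSET.search(raw[:4096])  # meta tags sit near the top
    if meta_match:
        candidates.append(meta_match.group(1).decode('ascii'))
    candidates.append('utf-8')
    for name in candidates:
        try:
            return raw.decode(name)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown codec name or wrong guess; try the next candidate
    # Last resort: decode anyway and mark the bytes that could not be decoded.
    return raw.decode('utf-8', errors='replace')


def get_html(url, timeout=20):
    """Fetch url and return the decoded text, or None if the request fails."""
    import requests  # third-party; imported here so decode_page works without it
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
    except requests.RequestException:
        return None
    # Use the raw bytes (r.content) instead of r.text so the charset guess is ours.
    return decode_page(r.content, r.headers.get('Content-Type', ''))
```

The returned string can be passed to BeautifulSoup as before. This sketch cannot recover a page whose server explicitly declares a wrong single-byte charset such as ISO-8859-1, since those bytes decode without raising an error. Two shorter alternatives that may be enough in practice: set r.encoding = r.apparent_encoding before reading r.text, or pass r.content (bytes) to BeautifulSoup and let it detect the encoding itself.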