When I run it, the spider crawls only one page and then stops. After removing the try block, I get the error below.
I get the same error as you. After removing the try block, it shows that the get_text() call in html_parser fails:
Traceback (most recent call last):
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 41, in <module>
    obj_spider.craw(root_url)  # start the crawler
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 23, in craw
    new_urls, new_data =self.parser.parse(new_url,html_cont)
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 40, in parse
    new_data = self._get_new_data(page_url,soup)
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 27, in _get_new_data
    res_data['title'] =title_node.get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'
2018-12-17
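The page content may not have been downloaded successfully. In that case the parser finds no title node, so title_node is None and calling get_text() on it raises this AttributeError.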
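
One way to keep the crawl from dying on such a page is to check for None before calling get_text(). The following is only a minimal sketch, not the code from the thread: the helper name safe_get_text and the "h1" lookup are assumptions for illustration, since the thread does not show how title_node is actually located. It expects soup to be a BeautifulSoup-style object whose find() returns None when nothing matches.

```python
def safe_get_text(node, default=None):
    """Return the stripped text of a node, or `default` if the lookup found nothing."""
    if node is None:
        # find() returned None: the element is missing, often because the
        # download failed or the page layout differs from what was expected.
        return default
    return node.get_text().strip()


def get_new_data(page_url, soup):
    """Collect page data without crashing when the title element is absent."""
    res_data = {"url": page_url}
    # Assumption: "h1" stands in for whatever selector the real parser uses.
    title_node = soup.find("h1")
    res_data["title"] = safe_get_text(title_node)
    return res_data
```

With this guard, a page without a title yields title set to None instead of raising, so the caller can skip or log that URL. It is also worth printing html_cont for the failing URL to confirm whether the download really returned the expected page.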