The crawl fails, with errors in the spider_main and outputer modules.
The problem is as follows:
craw 1 :https://baike.baidu.com/item/Python/407313?fr=aladdin
Traceback (most recent call last):
  File "C:\Users\Administrator\workspace\python_spider\spider_main.py", line 48, in <module>
    obj_spider.craw(root_url)
  File "C:\Users\Administrator\workspace\python_spider\spider_main.py", line 37, in craw
    self.outputer.output_html()
  File "C:\Users\Administrator\workspace\python_spider\html_outputer.py", line 28, in output_html
    fout.write("<td>%s</td>"% data['summary'])
UnicodeEncodeError: 'gbk' codec can't encode character u'\xa0' in position 14: illegal multibyte sequence
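
The last frame suggests the text is being encoded as GBK on write, and GBK has no mapping for U+00A0 (the no-break space that shows up in the scraped summary). Below is a minimal sketch of one possible fix, assuming it is acceptable to write the output file as UTF-8 instead. The function signature, the `datas` parameter, the sample record, and the temporary path are made up for illustration; this is not the original html_outputer.py code.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def output_html(datas, path):
    """Write the collected records to an HTML table, encoded as UTF-8."""
    # io.open accepts an explicit encoding on both Python 2 and Python 3,
    # so the write does not depend on a GBK default or setting.
    with io.open(path, "w", encoding="utf-8") as fout:
        # Declare the charset so the browser decodes the page as UTF-8.
        fout.write(u"<html><head><meta charset='utf-8'></head><body><table>")
        for data in datas:
            fout.write(u"<tr><td>%s</td></tr>" % data["summary"])
        fout.write(u"</table></body></html>")


# Hypothetical record containing U+00A0, the character named in the traceback.
sample = [{"summary": u"Python\xa0is a programming language"}]
out_path = os.path.join(tempfile.mkdtemp(), "output.html")
output_html(sample, out_path)
```

If the file really has to stay in GBK, another option would be to replace the character before writing, for example `data['summary'].replace(u'\xa0', u' ')`.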