Why does the scraped content come out as bytes (a byte string)?
2016-03-16
From: Developing a Simple Crawler in Python, 7-5
2018-07-27
class HtmlOutputer(object):
    def __init__(self):
        self.datas = []

    def collect_data(self, data):
        if data is None:
            return
        self.datas.append(data)

    def output_html(self):
        # Python 3: open in text mode with an explicit encoding so that
        # str values are encoded as UTF-8 when written (not ASCII).
        with open('output.html', 'w', encoding='utf-8') as fout:
            fout.write("<html>")
            fout.write("<head>")
            fout.write('<meta charset="UTF-8">')
            fout.write("</head>")
            fout.write("<body>")
            fout.write("<table>")

            # One table row per collected page
            for data in self.datas:
                fout.write("<tr>")
                fout.write("<td>%s</td>" % data['url'])
                fout.write("<td>%s</td>" % data['title'])
                fout.write("<td>%s</td>" % data['summary'])
                fout.write("</tr>")

            fout.write("</table>")
            fout.write("</body>")
            fout.write("</html>")
2016-04-24
The default encoding in Python 2.x is ASCII.
2016-04-06
Python 3.4.4 doesn't have this encoding problem.
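One caveat, as far as I know: in Python 3 the default encoding used by open() still depends on the platform, so passing encoding='utf-8' explicitly is the safer habit. A minimal sketch of the round trip, assuming a made-up sample string and a temporary file (neither comes from the course code):

```python
import os
import tempfile

# Sample non-ASCII content, invented for this example.
text = "<td>Python 爬虫</td>"

# Use a temporary file so the example leaves nothing behind.
fd, path = tempfile.mkstemp(suffix=".html")
os.close(fd)
try:
    # Explicit encoding on write: the str is encoded as UTF-8.
    with open(path, "w", encoding="utf-8") as fout:
        fout.write(text)
    # Same encoding on read: the bytes are decoded back to the same str.
    with open(path, "r", encoding="utf-8") as fin:
        read_back = fin.read()
finally:
    os.remove(path)

print(read_back == text)
```

If the write and the read use the same encoding, the text should come back unchanged.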
2016-03-19
This looks like Python 2's default ASCII encoding; in Python 3, strings are Unicode by default.
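To connect this to the original question: in Python 3, raw downloaded data is a bytes object, and it only becomes readable text (str) after decoding. A small sketch with no network access, assuming the page is UTF-8 encoded; page_bytes is a stand-in I made up for what a download call would return:

```python
# Stand-in for downloaded page content: raw UTF-8 bytes (made up for this example).
page_bytes = "<title>Python 爬虫</title>".encode("utf-8")
print(type(page_bytes).__name__)  # bytes

# Decode with the page's encoding (assumed UTF-8 here) to get a str.
page_text = page_bytes.decode("utf-8")
print(type(page_text).__name__)  # str
print(page_text)
```

If printing the downloaded content shows something like b'...' with \x escapes, it is most likely still bytes and needs a decode() with the page's actual encoding.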