
How can I drop the first five items from my Python crawler's output?

Python

格瑞克愛鮮荔枝 2017-04-04 11:36:44

from urllib import request
import urllib
from bs4 import BeautifulSoup
import xlwt
import re

book = xlwt.Workbook(encoding='utf-8', style_compression=0)
sheet1 = book.add_sheet('mymovie', cell_overwrite_ok=True)
url = 'http://www.meijuworld.com/category/uk'
req = request.Request(url)
# Header name corrected from 'user-agentkk' so the server sees a real User-Agent
req.add_header('User-Agent', 'Mozilla/5.0')
response = request.urlopen(req)
html_doc = response.read()
wholepage = BeautifulSoup(html_doc, 'html.parser', from_encoding='UTF-8')
# Collect the title blocks, then pull every .html link out of their markup
meiju = wholepage.find_all('div', class_='an-widget-title')
ds = re.findall('http://www.meijuworld.com/.*.html', str(meiju))
for i in ds:
    print(i)
print('ok')
# The output also picks up five other links from the page ahead of the ones I want.
# How do I drop those first five so that only the last 12 links are printed?
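One possible approach, assuming the five unwanted links always come first in `ds`, is to slice the list before printing it. The sketch below runs offline: the `skip_first` helper and the `page1.html` to `page17.html` URLs are hypothetical stand-ins for illustration, not links taken from the real page.

```python
def skip_first(items, n=5):
    """Return a new list without the first n items (empty if there are n or fewer)."""
    return list(items)[n:]


# Hypothetical stand-in for the 17 links that re.findall() put into `ds`
ds = ['http://www.meijuworld.com/page%d.html' % i for i in range(1, 18)]

wanted = skip_first(ds)  # equivalent to ds[5:]
for link in wanted:
    print(link)
print('ok')
```

In the original script this amounts to looping over `ds[5:]` instead of `ds`. Slicing only helps while the unwanted links stay at the front; if their position can change, filtering on something that distinguishes them (for example the URL path) would likely be more robust.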
