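If you are running into problems, I suggest changing these three lines of code first:
listurl = re.findall(r'//.+?\.jpg*', buf)  # match the contents of src
f = open('D:/picture/' + str(i) + '.jpg', 'wb')  # save the image to the picture folder on drive D
req = urllib2.urlopen('http:' + url)  # fetch the image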
2018-01-18
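The changes above assume the page embeds its images as protocol-relative URLs (starting with `//`), which is why `'http:'` is prepended before fetching. Here is a minimal sketch of that step, run on a made-up HTML fragment; the sample markup, the host name and the helper name `extract_image_urls` are assumptions for illustration only. It uses `\.jpg` rather than `\.jpg*`, because `g*` would also accept a bare `.jp`.

```python
import re

# Hypothetical HTML fragment; the real page content will differ.
sample_html = (
    '<img src="//img.example.com/a/1.jpg">'
    '<img src="//img.example.com/b/2.jpg">'
)

def extract_image_urls(html):
    """Return absolute URLs for every protocol-relative .jpg link in html."""
    # The non-greedy .+? stops at the first ".jpg" that follows each "//".
    relative = re.findall(r'//.+?\.jpg', html)
    # Protocol-relative links carry no scheme, so add one before fetching.
    return ['http:' + u for u in relative]

urls = extract_image_urls(sample_html)
print(urls)
```

Keeping the extraction in its own function makes it easy to check the regex against saved HTML before hitting the network.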
listurl = re.findall(r'//.+?\.jpg*', buf)
2018-01-18
i = 0
old_url = ''
for _url in listurl:
    url = 'http:' + _url
    # Skip consecutive duplicates before opening a file, so no empty file is left behind
    if url == old_url:
        continue
    old_url = url
    # print(url)
    req = request.urlopen(url)
    buf = req.read()
    with open(str(i) + '.jpg', 'wb') as f:
        f.write(buf)
    i += 1
    print('download %s' % i)
2018-01-07
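The loop above only skips a URL when it repeats the one immediately before it. If the same image can reappear later in the list, a set-based filter is one possible alternative. The sketch below is an illustration rather than the poster's method: `unique_in_order`, `save_images` and the `fetch` parameter are hypothetical names, and `fetch` exists only so the logic can be tried without network access.

```python
import os
import tempfile
from urllib import request

def unique_in_order(urls):
    """Drop repeated URLs while keeping first-seen order."""
    seen = set()
    result = []
    for u in urls:
        if u not in seen:
            seen.add(u)
            result.append(u)
    return result

def save_images(urls, out_dir, fetch=None):
    """Save each distinct URL as 0.jpg, 1.jpg, ... inside out_dir.

    fetch is a hypothetical hook: any callable taking a URL and
    returning bytes. By default it downloads with urllib.
    """
    if fetch is None:
        fetch = lambda u: request.urlopen(u).read()
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, url in enumerate(unique_in_order(urls)):
        path = os.path.join(out_dir, '%d.jpg' % i)
        with open(path, 'wb') as f:  # the file is closed even if the write fails
            f.write(fetch(url))
        paths.append(path)
    return paths

# Offline demo with made-up URLs and a fake fetcher that returns the URL as bytes.
demo_urls = ['http://a/1.jpg', 'http://a/2.jpg', 'http://a/1.jpg']
demo_dir = tempfile.mkdtemp()
saved = save_images(demo_urls, demo_dir, fetch=lambda u: u.encode('utf-8'))
print(len(saved))
```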
Python 3.6 version:
from urllib import request
import re
url = 'http://idcbgp.cn/course/list'
req = request.urlopen(url)
buf = req.read()
buf = buf.decode('utf-8')
listurl = re.findall(r'\/\/img.+?\.jpg',buf)
# for _url in listurl:
#     print(_url)
2018-01-07
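The `decode('utf-8')` call in the Python 3 version matters because `read()` returns `bytes`, and in Python 3 a `str` regex pattern cannot be applied to a `bytes` object. A small sketch of the difference, using a made-up response body in place of a real download (the markup and host name are assumptions):

```python
import re

# Hypothetical response body; request.urlopen(url).read() returns bytes like this.
raw = '<img src="//img.example.com/cover.jpg" alt="課程">'.encode('utf-8')

# A str pattern applied to bytes raises TypeError in Python 3.
try:
    re.findall(r'//img.+?\.jpg', raw)
    str_pattern_on_bytes_ok = True
except TypeError:
    str_pattern_on_bytes_ok = False

# After decoding, the same pattern works on the resulting str.
text = raw.decode('utf-8')
found = re.findall(r'//img.+?\.jpg', text)
print(str_pattern_on_bytes_ok, found)
```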
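Accepted answer / qq_愛吃羊的鯨魚_0
\1 refers back to the earlier group "([\w]+>)". If you substitute the group for \1, you get ma=re.match(r'<([\w]+>)[\w]+</([\w]+>)','<book>python</book>'). The parentheses no longer serve a purpose there, and dropping them gives ma=re.match(r'<[\w]+>[\w]+</[\w]+>','<book>python</book>'). Seen this way it should make sense. (Strictly speaking, \1 must match the exact text the group captured, so the two forms agree only when the closing tag matches the opening one, as it does here.) The reason it fails to match when you append 1 afterwards is also because &...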
2017-12-25
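To see what the backreference adds, here is a short sketch comparing the two patterns on a matching and a mismatched closing tag. The pattern with `\1` is inferred from the answer above, and the `<book>python</title>` input is an invented counterexample.

```python
import re

# Pattern with a backreference: \1 must repeat the text captured by group 1.
pattern_with_backref = r'<([\w]+>)[\w]+</\1'
# Pattern with the group simply written out again: any closing tag is accepted.
pattern_without = r'<[\w]+>[\w]+</[\w]+>'

matched = re.match(pattern_with_backref, '<book>python</book>')
mismatched_backref = re.match(pattern_with_backref, '<book>python</title>')
mismatched_plain = re.match(pattern_without, '<book>python</title>')

print(matched.group(1))            # the text group 1 captured
print(mismatched_backref is None)  # the backreference rejects the wrong closing tag
print(mismatched_plain is None)    # the plain pattern accepts it
```

Group 1 captures `book>` including the closing angle bracket, so `\1` requires the closing tag to end in exactly that text.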
Top-rated answer / 華燈初上丶
import re
import urllib.request

req = urllib.request.urlopen('http://idcbgp.cn/course/list')
# Add decode() here; otherwise the downloaded data comes out garbled
buf = req.read().decode("utf-8")
# The URLs shown in the teacher's lecture have changed, so just adjust the regex
# listurl = re.findall(r'src=.+\.jpg', buf)
listurl = re.findall(r'//img.+?\.jpg', bu...
2017-12-11