2 Answers

Contributed 1784 experience points · received 7+ likes
The images are loaded dynamically, so you have to use selenium to scrape them. Here is the complete code to do that:
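from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import requests

driver = webdriver.Chrome()
driver.get('https://www.pokemon.com/us/pokedex/')
time.sleep(4)  # wait for the dynamically loaded images to appear

# Selenium 4 removed the find_element(s)_by_* helpers; use find_element(s)(By..., ...) instead.
# Drop the last three matching elements.
li_tags = driver.find_elements(By.CLASS_NAME, 'animating')[:-3]
for li_num, li in enumerate(li_tags, start=1):
    img_link = li.find_element(By.XPATH, './/img').get_attribute('src')
    # Absolute XPath to the name heading of the li_num-th list item
    name = li.find_element(By.XPATH, f'/html/body/div[4]/section[5]/ul/li[{li_num}]/div/h5').text
    r = requests.get(img_link)
    with open(f"D:\\{name}.png", "wb") as f:
        f.write(r.content)
driver.close()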
Output:
12 Pokémon images.
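Also, I noticed there is a "Load more" button at the bottom of the page. Clicking it loads more images, and after clicking it we have to keep scrolling down to load the rest. If I remember correctly, there are 893 images on the site in total. To scrape all 893 images, you can use the following code: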
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import requests

# Scroll to the bottom of the page and return the resulting page height
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight);var lenOfPage=document.body.scrollHeight;return lenOfPage;"

driver = webdriver.Chrome()
driver.get('https://www.pokemon.com/us/pokedex/')
time.sleep(3)

# Click "Load more" through JavaScript, then keep scrolling until the height stops growing
load_more = driver.find_element(By.XPATH, '//*[@id="loadMore"]')
driver.execute_script("arguments[0].click();", load_more)
lenOfPage = driver.execute_script(SCROLL_JS)
match = False
while not match:
    lastCount = lenOfPage
    time.sleep(1.5)  # give the next batch of images time to load
    lenOfPage = driver.execute_script(SCROLL_JS)
    if lastCount == lenOfPage:
        match = True

# Selenium 4 removed the find_element(s)_by_* helpers; use find_element(s)(By..., ...) instead.
li_tags = driver.find_elements(By.CLASS_NAME, 'animating')[:-3]
for li_num, li in enumerate(li_tags, start=1):
    img_link = li.find_element(By.XPATH, './/img').get_attribute('src')
    # Absolute XPath to the name heading of the li_num-th list item
    name = li.find_element(By.XPATH, f'/html/body/div[4]/section[5]/ul/li[{li_num}]/div/h5').text
    r = requests.get(img_link)
    with open(f"D:\\{name}.png", "wb") as f:
        f.write(r.content)
driver.close()
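
The "scroll until the page height stops changing" loop can also be factored into a small helper. This is only a minimal sketch, not part of the answer's code: `scroll_until_stable`, `scroll_and_measure`, and `max_rounds` are hypothetical names, and the callable is assumed to scroll to the bottom of the page and return the new page height.

```python
import time
from typing import Callable


def scroll_until_stable(scroll_and_measure: Callable[[], int],
                        pause: float = 1.5,
                        max_rounds: int = 500) -> int:
    """Call scroll_and_measure until it returns the same height twice in a row.

    scroll_and_measure is a hypothetical callable that scrolls to the bottom
    of the page and returns the page height afterwards.
    """
    height = scroll_and_measure()
    for _ in range(max_rounds):
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = scroll_and_measure()
        if new_height == height:
            break  # nothing new was loaded, so the page is fully expanded
        height = new_height
    return height
```

With Selenium, the callable could be something like `lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;")`. The `max_rounds` cap is an added safeguard so the loop cannot run forever on a page that keeps growing.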

Contributed 1798 experience points · received 7+ likes
This may be easier to do if you check the Network tab first:
import time
import requests

endpoint = "https://www.pokemon.com/us/api/pokedex/kalos"
# contains all metadata
data = requests.get(endpoint).json()
# collect keys needed to save the picture
items = [{"name": item["name"], "link": item["ThumbnailImage"]} for item in data]
# remove duplicates (dicts are not hashable, so go through tuples of their items)
d = [dict(t) for t in {tuple(item.items()) for item in items}]
assert len(d) == 893
for pokemon in d:
    response = requests.get(pokemon["link"])
    time.sleep(1)  # pause between downloads to avoid hammering the server
    with open(f"{pokemon['name']}.png", "wb") as f:
        f.write(response.content)
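
One note on the duplicate-removal step: going through a set drops duplicates but does not keep the original order of the entries. If order matters, an order-preserving variant could look like the sketch below. `dedupe_records` is a hypothetical helper name and the sample records are made up for illustration; they are not real API output.

```python
def dedupe_records(records):
    """Return records with exact duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))  # hashable fingerprint of the dict
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up sample data, not real API output.
sample = [
    {"name": "Bulbasaur", "link": "https://example.com/001.png"},
    {"name": "Bulbasaur", "link": "https://example.com/001.png"},
    {"name": "Ivysaur", "link": "https://example.com/002.png"},
]
print([record["name"] for record in dedupe_records(sample)])
```

Sorting the items before building the key means two dicts with the same contents compare equal regardless of key order.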