Unable to get a div tag with Beautiful Soup in Python

翻閱古今 2023-12-29 17:02:24
I am trying to download all the Pokémon images available on the official website. I am doing this because I want high-quality images. Here is the code I wrote.

from bs4 import BeautifulSoup as bs4
import requests

request = requests.get('https://www.pokemon.com/us/pokedex/')
soup = bs4(request.text, 'html')
print(soup.findAll('div',{'class':'container       pokedex'}))

The output is

[]

Am I doing something wrong? Also, is it legal to scrape the official website? Is there a tag or anything else that indicates this? Thanks.

PS: I am new to BS and HTML.

2 Answers

嚕嚕噠

1784 contribution points, over 7 likes

The images are loaded dynamically, so you have to use selenium to scrape them. Here is the complete code to do that:


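from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import requests

driver = webdriver.Chrome()
driver.get('https://www.pokemon.com/us/pokedex/')

# Give the dynamically loaded images time to appear
time.sleep(4)

# Selenium 4 removed the find_element(s)_by_* helpers; use By locators instead.
# Collect the list items, skipping the last three matches.
li_tags = driver.find_elements(By.CLASS_NAME, 'animating')[:-3]

for li_num, li in enumerate(li_tags, start=1):
    img_link = li.find_element(By.XPATH, './/img').get_attribute('src')
    # Absolute XPath to the name heading of the li_num-th list item
    name = li.find_element(By.XPATH, f'/html/body/div[4]/section[5]/ul/li[{li_num}]/div/h5').text

    r = requests.get(img_link)

    # Save the image under the Pokémon's name
    with open(f"D:\\{name}.png", "wb") as f:
        f.write(r.content)

driver.close()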

Output:

12 Pokémon images. Here are the first two:

Image 1:

https://img1.sycdn.imooc.com/658e8c5a0001006c02140216.jpg

Image 2:

https://img1.sycdn.imooc.com/658e8c630001f21702170208.jpg
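
A fixed `time.sleep(4)` either waits longer than needed or not long enough on a slow connection. One alternative is to poll until the elements show up. The helper below is a minimal sketch of that idea; `wait_until` and its parameters are hypothetical names introduced here, not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    (Hypothetical helper for illustration; not part of Selenium.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

With the driver from the script above and `By` imported from `selenium.webdriver.common.by`, `wait_until(lambda: driver.find_elements(By.CLASS_NAME, 'animating'))` would return as soon as at least one matching element exists, since an empty list is falsy. Selenium also ships its own `WebDriverWait` class for the same purpose.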

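Also, I noticed there is a Load More button at the bottom of the page. Clicking it loads more images. After clicking Load More, we have to keep scrolling down for further images to load. If I remember correctly, there are 893 images on the site in total. To scrape all 893 images, you can use the following code: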


from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import requests

driver = webdriver.Chrome()
driver.get('https://www.pokemon.com/us/pokedex/')
time.sleep(3)

# Click the Load More button via JavaScript
# (Selenium 4 removed the find_element(s)_by_* helpers; use By locators instead)
load_more = driver.find_element(By.XPATH, '//*[@id="loadMore"]')
driver.execute_script("arguments[0].click();", load_more)

# Scroll to the bottom repeatedly until the page height stops growing
scroll_js = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"
len_of_page = driver.execute_script(scroll_js)
while True:
    last_count = len_of_page
    time.sleep(1.5)
    len_of_page = driver.execute_script(scroll_js)
    if last_count == len_of_page:
        break

# Collect the list items, skipping the last three matches
li_tags = driver.find_elements(By.CLASS_NAME, 'animating')[:-3]

for li_num, li in enumerate(li_tags, start=1):
    img_link = li.find_element(By.XPATH, './/img').get_attribute('src')
    # Absolute XPath to the name heading of the li_num-th list item
    name = li.find_element(By.XPATH, f'/html/body/div[4]/section[5]/ul/li[{li_num}]/div/h5').text

    r = requests.get(img_link)

    # Save the image under the Pokémon's name
    with open(f"D:\\{name}.png", "wb") as f:
        f.write(r.content)

driver.close()
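
Both scripts above build the file path directly from the name shown on the page. If a name contains a character that Windows does not allow in file names (a colon, for example), `open` will fail. The sketch below shows one way to sanitize the name first; `safe_filename`, `image_path`, and the choice of `_` as the replacement character are assumptions made for illustration.

```python
import re
from pathlib import Path

# Characters that Windows rejects in file names, plus control characters
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(name, replacement="_"):
    """Replace characters that are invalid in Windows file names."""
    cleaned = _INVALID_CHARS.sub(replacement, name).strip(" .")
    # Fall back to a fixed name if nothing usable is left
    return cleaned or "unnamed"


def image_path(directory, name, extension=".png"):
    """Build the target path for an image (hypothetical helper)."""
    return Path(directory) / f"{safe_filename(name)}{extension}"
```

In the download loop, `with open(image_path("D:\\", name), "wb") as f:` could then stand in for the f-string path.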


Answered 2023-12-29

元芳怎么了

1798 contribution points, over 7 likes

This may be easier to do if you first check the Network tab:


import time
import requests

endpoint = "https://www.pokemon.com/us/api/pokedex/kalos"

# contains all metadata
data = requests.get(endpoint).json()

# collect keys needed to save the picture
items = [{"name": item["name"], "link": item["ThumbnailImage"]} for item in data]

# remove duplicates
unique_items = [dict(t) for t in {tuple(item.items()) for item in items}]
assert len(unique_items) == 893

for pokemon in unique_items:
    response = requests.get(pokemon["link"])
    # pause between requests
    time.sleep(1)

    with open(f"{pokemon['name']}.png", "wb") as f:

        f.write(response.content)
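
The set-based de-duplication above discards the order in which the API returned the entries. If the order matters, a dictionary keyed on the name keeps the first occurrence of each entry in place. This is a minimal sketch on made-up sample data; `dedupe_by_key` is a hypothetical helper, and the sample links are placeholders rather than real API output.

```python
def dedupe_by_key(items, key):
    """Return items with duplicate `key` values removed, keeping first occurrences in order."""
    seen = {}
    for item in items:
        # setdefault stores the item only if this key value has not been seen yet
        seen.setdefault(item[key], item)
    return list(seen.values())


# Placeholder data shaped like the `items` list above (not real API output)
sample = [
    {"name": "Bulbasaur", "link": "https://example.com/001.png"},
    {"name": "Ivysaur", "link": "https://example.com/002.png"},
    {"name": "Bulbasaur", "link": "https://example.com/001.png"},
]
unique = dedupe_by_key(sample, "name")
print([p["name"] for p in unique])  # ['Bulbasaur', 'Ivysaur']
```

Note that this treats two entries with the same name as duplicates even if their links differ, which is slightly stricter than the tuple comparison in the answer.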


Answered 2023-12-29