3 Answers

Contributor with 1998 experience points, more than 6 upvotes
You can add request headers. Then, when you call find_all('a'), you can read each link's href:
import requests
from bs4 import BeautifulSoup

link = "https://www.amazon.in/Power-Banks/b/ref=nav_shopall_sbc_mobcomp_powerbank?ie=UTF8&node=6612025031"

def amazon(url):
    # Send a browser-like User-Agent instead of the default python-requests one
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
    sourcecode = requests.get(url, headers=headers)
    sourcecode_text = sourcecode.text
    soup = BeautifulSoup(sourcecode_text, 'html.parser')
    # href=True keeps only <a> tags that actually carry an href attribute
    for anchor in soup.find_all('a', href=True):
        href = anchor.get('href')
        print(href)

amazon(link)
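
Some of the printed hrefs may be relative paths, in-page anchors, or javascript: pseudo-links rather than full URLs. The following is a minimal sketch of one way to turn them into absolute URLs with the standard library. The helper name absolutize and the sample hrefs are hypothetical and only there for illustration; they are not taken from the page.

```python
from urllib.parse import urljoin

def absolutize(base_url, hrefs):
    """Resolve each href against base_url, skipping non-navigational links."""
    resolved = []
    for href in hrefs:
        # Skip in-page anchors and javascript: pseudo-links.
        if href.startswith("#") or href.lower().startswith("javascript:"):
            continue
        # urljoin leaves absolute URLs unchanged and resolves relative ones.
        resolved.append(urljoin(base_url, href))
    return resolved

base = "https://www.amazon.in/Power-Banks/b/ref=nav_shopall_sbc_mobcomp_powerbank"
# Hypothetical hrefs, standing in for what find_all('a', href=True) might yield.
sample = ["/dp/EXAMPLE123", "#top", "javascript:void(0)", "https://www.amazon.in/gp/help"]
print(absolutize(base, sample))
```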

Contributor with 2019 experience points, more than 9 upvotes
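The problem in your code is that you used the wrong method name, findALL. The soup object has no findALL method, so the lookup returns None. To fix it, use find_all in new code; findAll (with a lowercase double l) should also work. I hope this makes it clear.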
import requests
from bs4 import BeautifulSoup

link = "https://www.amazon.in/Power-Banks/b/ref=nav_shopall_sbc_mobcomp_powerbank?ie=UTF8&node=6612025031"

def amazon(url):
    sourcecode = requests.get(url)
    sourcecode_text = sourcecode.text
    # Pass "html.parser" as the second argument so you do not get a warning
    soup = BeautifulSoup(sourcecode_text, "html.parser")
    # Use soup.find_all in new code; soup.findAll should also work
    for anchor in soup.find_all('a', {'class': 'a-link-normal aok-block a-text-normal'}):
        href = anchor.get('href')
        print(href)

amazon(link)
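
To see the difference between findALL and find_all without making a network request, here is a small sketch that runs against an inline HTML string. The HTML snippet and its hrefs are made up for illustration. BeautifulSoup treats an unknown attribute name as a tag-name lookup, which is why soup.findALL evaluates to None instead of a method.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a fetched page.
html = '<a class="a-link-normal" href="/dp/EXAMPLE1">One</a><a href="/dp/EXAMPLE2">Two</a>'
soup = BeautifulSoup(html, "html.parser")

# Unknown attributes are treated as tag lookups, so this behaves like
# soup.find("findALL") and yields None because no such tag exists.
missing = soup.findALL
print(missing)

# find_all is the supported method name and returns the matching tags.
hrefs = [a["href"] for a in soup.find_all("a", href=True)]
print(hrefs)
```

Calling soup.findALL(...) therefore fails, because it tries to call None.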

Contributor with 1111 experience points, more than 0 upvotes
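If you try to scrape Amazon with requests now, you will not get anything useful back, because Amazon will detect that the request comes from a script, and headers do not help either (as far as I know).
Instead, the response will contain the following: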
To discuss automated access to Amazon data please contact api-services-support@amazon.com.
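
A script can check for this message before it tries to parse anything. The following is a minimal sketch, assuming the notice appears verbatim in the response body; the function name looks_blocked and the sample strings are hypothetical.

```python
# Leading part of the notice quoted above.
BLOCK_MARKER = "To discuss automated access to Amazon data"

def looks_blocked(html_text):
    """Return True if the response body contains the automated-access notice."""
    return BLOCK_MARKER in html_text

# Made-up response bodies for illustration.
blocked_body = "<!-- To discuss automated access to Amazon data please contact api-services-support@amazon.com. -->"
normal_body = "<html><body><a href='/dp/EXAMPLE1'>Product</a></body></html>"

print(looks_blocked(blocked_body))
print(looks_blocked(normal_body))
```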
You can scrape Amazon by rendering the page with requests-html or selenium.
A simple requests-html example that scrapes titles (the results will be similar if you open the same link in an incognito tab):
from requests_html import HTMLSession

session = HTMLSession()
url = 'https://www.amazon.com/s?k=apple+watch+series+6+band'
r = session.get(url)
# Render the page's JavaScript in a headless Chromium (downloaded on first use)
r.html.render(sleep=1, keep_page=True, scrolldown=1)

for container in r.html.find('.a-size-medium'):
    title = container.text
    print(f"Title: {title}")
Output:
Title: New Apple Watch Series 6 (GPS, 40mm) - (Product) RED - Aluminum Case with (Product) RED - Sport Band
Title: SUPCASE [Unicorn Beetle Pro] Designed for Apple Watch Series 6/SE/5/4 [44mm], Rugged Protective Case with Strap Bands(Black)
Title: Spigen Rugged Armor Pro Designed for Apple Watch Band with Case for 44mm Series 6/SE/5/4 - Charcoal Gray
Title: Highly rated and well-priced products
Title: Fitlink Stainless Steel Metal Band for Apple Watch 38/40/42/44mm Replacement Link Bracelet Band Compatible with Apple Watch Series 6 Apple Watch Series 5 Apple Watch Series 1/2/3/4 (Grey,42/44mm)
Title: TalkWorks Compatible for Apple Watch Band 42mm / 44mm Comfort Fit Mesh Loop Stainless Steel Adjustable Magnetic Strap for iWatch Series 6, 5, 4, 3, 2, 1, SE - Rose Gold
Title: COOYA Compatible for Apple Watch Band 44mm 42mm Women Men iWatch Wristband with Protective Rugged Case Sport Strap Adjustable Replacement Band Compatible with Apple Watch Series 6 SE 5 4 3 2, Clear
Title: Stainless Steel Metal Bands Compatible with Apple Watch Band 42mm 44mm, Gold Replacement Strap with Adapter+Case Cover Compatible with iWatch Series 6 5 4 3 2 1 SE Sport
Title: elago W2 Charger Stand Compatible with Apple Watch Series 6/SE/5/4/3/2/1 (44mm, 42mm, 40mm, 38mm), Durable Silicone, Compatible with Nightstand Mode (Black)
Title: Element Case Black Ops Watch Band for Apple Watch Series 4/5/6/SE, 44mm - Black (EMT-522-244A-01)
...