
How to scrape information from an unordered list with Selenium + Python

Python

滄海一幻覺 2022-12-20 14:38:22

I am working on a web scraping project in which I am trying to scrape information from the Amazon website. The product page contains an unordered list with information like this:

Item Weight: 17.2 pounds
Shipping Weight: 17.4 pounds (View shipping rates and policies)
ASIN: B00HC767P6
UPC: 766789717088 885720483186 052000201628
Item model number: mark-1hooi-toop842
Customer Reviews: 4.8 out of 5 stars 1,352 customer ratings
Amazon Best Sellers Rank: #514 in Grocery & Gourmet Food (See Top 100 in Grocery & Gourmet Food) #12 in Sports Drinks

The list itself has no class. The problem is that I do not want all the information in the list, only the ASIN code. The li tags have no specific class or ID either. Here is the link to the product details page.

Before using Selenium I worked with BeautifulSoup, and this is how I solved the problem:

asin = str(soup.find('bdi', {'dir': 'ltr'}).find_parent('li'))[38:].split('<')[0]

I am now switching to Selenium. How can I scrape this information?


1 Answer

繁華開滿天機(jī)


You can use a CSS selector to get the relevant li item, as follows:

Finding a child element by index with a CSS selector

$(".content > ul > li:nth-child(2)").textContent >>> "Shipping Weight: 33 pounds (View shipping rates and policies)"

$(".content > ul > li:nth-child(3)").textContent >>> "ASIN: B07QKN2ZT9"

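The corresponding Python Selenium code (Selenium 4 syntax, which needs from selenium.webdriver.common.by import By; the older find_element_by_css_selector helper was removed in Selenium 4.3):

driver.find_element(By.CSS_SELECTOR, ".content > ul > li:nth-child(3)").text.split(": ")[1] >>> 'B07QKN2ZT9'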
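The index-based lookup can be wrapped in a small helper. This is only a sketch: `field_value` and `asin_by_index` are names made up for illustration, and the `.content > ul > li` selector is taken from the answer above, not checked against the live page. It uses `partition` instead of `split(": ")[1]` so that a value which itself contains ": " is not cut short.

```python
def field_value(item_text):
    """Return the part after the first ': ' in text such as 'ASIN: B07QKN2ZT9'."""
    label, sep, value = item_text.partition(": ")
    if not sep:
        raise ValueError(f"no 'label: value' pair in {item_text!r}")
    return value.strip()


def asin_by_index(driver, index=3):
    """Read the index-th <li> (1-based) of the details list from a live driver.

    Hypothetical helper: assumes the ASIN sits at a fixed position in the list.
    """
    # Imported lazily so field_value() can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    selector = f".content > ul > li:nth-child({index})"
    return field_value(driver.find_element(By.CSS_SELECTOR, selector).text)


print(field_value("ASIN: B07QKN2ZT9"))  # B07QKN2ZT9
```

With a running browser session you would call `asin_by_index(driver)`; the parsing part can be checked without a browser at all.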

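Finding an ancestor element with XPath

If the ASIN is not always at the same index, you can locate the bdi element whose text is ASIN, move up to its ancestor::li, then take that element's text and extract the relevant part. With Selenium 4 this uses find_element(By.XPATH, ...) and needs from selenium.webdriver.common.by import By, since find_element_by_xpath was removed in 4.3:

driver.find_element(By.XPATH, "//bdi[text()='ASIN']/ancestor::li").text.split(": ")[1] >>> 'B07QKN2ZT9'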
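Because the details block may not be in the DOM the moment the page loads, it can help to combine the label-based XPath with an explicit wait. The following is a sketch under assumptions: `label_xpath` and `value_by_label` are illustrative names, and the 10-second timeout is arbitrary.

```python
def label_xpath(label):
    """Build the XPath for the <li> that contains a <bdi> whose text is `label`."""
    if "'" in label:
        # Keeps the generated expression valid; quoting tricks are out of scope here.
        raise ValueError("label must not contain a single quote")
    return f"//bdi[text()='{label}']/ancestor::li"


def value_by_label(driver, label="ASIN", timeout=10):
    """Wait for the labelled <li>, then return the text after the first ': '."""
    # Lazy imports: only this function needs Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, label_xpath(label))
    # Raises selenium.common.exceptions.TimeoutException if nothing matches in time.
    item = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located(locator)
    )
    return item.text.partition(": ")[2].strip()


print(label_xpath("ASIN"))  # //bdi[text()='ASIN']/ancestor::li
```

The same function should work for other rows, for example `value_by_label(driver, "UPC")`, provided the label text in the page matches exactly.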

Building the XPath

//<element type>[<attribute type> = <attribute value>]/<descendant>

//bdi[text() = 'ASIN'] >>> bdi element with text 'ASIN'

//bdi[@dir = 'ltr'] >>> bdi element whose dir attribute equals 'ltr'

Accessing an element's ancestors

/ancestor::<ancestor element type>

//bdi[text()='ASIN']/ancestor::li >>> li

//bdi[text()='ASIN']/ancestor::ul >>> ul
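To see what the ancestor step does without opening a browser, the idea can be tried on a small fragment with Python's standard library. Everything here is an assumption for illustration: the markup is a guess at the li structure (it is consistent with the [38:] slice in the question, but not copied from the page), and `ancestor` is a hypothetical helper. ElementTree supports only a subset of XPath and has no ancestor axis, so the climb is done by hand with a parent map.

```python
import xml.etree.ElementTree as ET

# Made-up fragment; the structure is an assumption, not copied from the live page.
FRAGMENT = """
<div class="content">
  <ul>
    <li><b><bdi dir="ltr">Item Weight</bdi>:</b> 17.2 pounds</li>
    <li><b><bdi dir="ltr">ASIN</bdi>:</b> B00HC767P6</li>
  </ul>
</div>
"""


def ancestor(element, tag, parent_of):
    """Climb from element to the nearest ancestor with the given tag, or None."""
    node = parent_of.get(element)
    while node is not None and node.tag != tag:
        node = parent_of.get(node)
    return node


root = ET.fromstring(FRAGMENT)
# ElementTree has no parent pointers, so build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Rough equivalent of //bdi[@dir='ltr'] filtered by text() = 'ASIN'.
bdi = next(e for e in root.iterfind(".//bdi[@dir='ltr']") if e.text == "ASIN")
li = ancestor(bdi, "li", parent_of)

li_text = "".join(li.itertext())  # "ASIN: B00HC767P6"
asin = li_text.partition(": ")[2].strip()
print(asin)  # B00HC767P6
```

In Selenium the browser's full XPath engine does the climb for you, so `/ancestor::li` is enough there; the parent map is only needed for this offline illustration.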

You can check this as a reference.

Answered 2022-12-20
