
Retrieving links from a Google search using BeautifulSoup in Python

Python

翻閱古今 2023-10-26 15:33:36

I am building a Twitter bot with Tweepy and BeautifulSoup4. I want to save the results of a request in a list, but my script no longer works (it worked a few days ago). I have been staring at it and cannot see what is wrong. Here is my function:

    import requests
    import tweepy
    from bs4 import BeautifulSoup
    import urllib
    import os
    from tweepy import StreamListener
    from TwitterEngine import TwitterEngine
    from ConfigEngine import TwitterAPIConfig
    import urllib.request
    import emoji
    import random

    # desktop user-agent
    USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
    # mobile user-agent
    MOBILE_USER_AGENT = "Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36"

    # Retrieve the links
    def parseLinks(url):
        headers = {"user-agent": USER_AGENT}
        resp = requests.get(url, headers=headers)
        if resp.status_code == 200:
            soup = BeautifulSoup(resp.content, "html.parser")
            results = []
            for g in soup.find_all('div', class_='r'):
                anchors = g.find_all('a')
                if anchors:
                    link = anchors[0]['href']
                    results.append(link)
            return results

The "url" argument passed in from the rest of the code is 100% correct. As output, I get "None". More precisely, execution stops right after the "results = []" line (so it never enters the for loop). Any ideas? Thanks a lot in advance!


1 Answer

夢里花落0921


Google appears to have changed the HTML markup of the results page. Try changing the class you search for from class="r" to class="rc":

    import requests
    from bs4 import BeautifulSoup

    USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"

    def parseLinks(url):
        headers = {"user-agent": USER_AGENT}
        resp = requests.get(url, headers=headers)
        if resp.status_code == 200:
            soup = BeautifulSoup(resp.content, "html.parser")
            results = []
            for g in soup.find_all('div', class_='rc'):  # <-- change 'r' to 'rc'
                anchors = g.find_all('a')
                if anchors:
                    link = anchors[0]['href']
                    results.append(link)
            return results

    url = 'https://www.google.com/search?q=tree'
    print(parseLinks(url))

Prints:

['https://en.wikipedia.org/wiki/Tree', 'https://simple.wikipedia.org/wiki/Tree', 'https://www.britannica.com/plant/tree', 'https://www.treepeople.org/tree-benefits', 'https://books.google.sk/books?id=yNGrqIaaYvgC&pg=PA20&lpg=PA20&dq=tree&source=bl&ots=_TP8PqSDlT&sig=ACfU3U16j9xRJgr31RraX0HlQZ0ryv9rcA&hl=sk&sa=X&ved=2ahUKEwjOq8fXyKjsAhXhAWMBHToMDw4Q6AEwG3oECAcQAg', 'https://teamtrees.org/', 'https://www.woodlandtrust.org.uk/trees-woods-and-wildlife/british-trees/a-z-of-british-trees/', 'https://artsandculture.google.com/entity/tree/m07j7r?categoryId=other']
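
Since the markup has already changed once, hard-coding a single class name is likely to break again. The following is only a minimal sketch, not a definitive fix: it separates parsing from fetching, so the parsing step can be checked against a saved HTML snippet without a network request, and it tries several candidate class names in turn. The function name `extract_links`, the `candidate_classes` parameter, and the sample HTML are hypothetical and introduced here for illustration; only the class names 'r' and 'rc' come from the posts above, and either may stop matching at any time.

```python
from bs4 import BeautifulSoup


def extract_links(html, candidate_classes=("rc", "r")):
    """Return one link per result <div>, trying each class name in turn.

    Hypothetical helper: the class names are assumptions and may stop
    matching whenever the page markup changes again.
    """
    soup = BeautifulSoup(html, "html.parser")
    for class_name in candidate_classes:
        results = []
        for div in soup.find_all("div", class_=class_name):
            anchor = div.find("a", href=True)  # first <a> that has an href
            if anchor:
                results.append(anchor["href"])
        if results:
            return results
    return []  # always a list, never None


# Made-up markup standing in for a saved results page.
sample_html = """
<div class="rc"><a href="https://example.com/a">A</a></div>
<div class="rc"><a href="https://example.com/b">B</a></div>
<div class="other"><a href="https://example.com/c">C</a></div>
"""

print(extract_links(sample_html))
# ['https://example.com/a', 'https://example.com/b']
```

With this split, `parseLinks` would only need to fetch the page and pass `resp.content` to the helper, and an empty list then clearly means "no matching elements" rather than a failed request.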

Answered 2023-10-26
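
A note on the "None" reported in the question: a Python function that finishes without reaching a `return` statement returns `None`. Assuming `return results` sits inside the `if resp.status_code == 200:` block, `parseLinks` returns `None` whenever the status code is anything other than 200, while a 200 response with no matching elements gives an empty list. A minimal sketch of that control flow, using a hypothetical stand-in function `links_for_status` in place of a real HTTP request:

```python
def links_for_status(status_code):
    """Hypothetical stand-in that mirrors the control flow of parseLinks."""
    if status_code == 200:
        results = []
        # Parsing would happen here; no matches leaves the list empty.
        return results
    # No return on this path, so Python returns None implicitly.


print(links_for_status(200))  # []
print(links_for_status(429))  # None
```

Printing `resp.status_code`, or calling `resp.raise_for_status()` before parsing, is one way to tell the two cases apart.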