
BeautifulSoup does not find all div tags


慕桂英4014372 2023-10-30 19:47:12
I have started a private project: web scraping with Python and BeautifulSoup in Visual Studio Code (1.41.0). I was able to scrape another website that has the same structure as my "problem website". Now, however, BeautifulSoup does not find all the div tags <div class="css-15dj4ut"></div> (there should be 20 per page, but I only find 3 of them). I get all of them from <div class="css-fh99y9 excbu0j0">...</div>, but none from <div class="css-roynbj excbu0j0"></div>. Do you know why?

Iterating over each URL to visit each page:

for i in range(0, endIndex):
    try:
        if i == 0:
            urls.append(basicUrl)
            page = urllib.request.urlopen(urls[i])
            soup = BeautifulSoup(page, 'html.parser')
            getSurgeonName(soup)
        else:
            urls.append(basicUrl + urlAddon + str(i + 1))
            page = urllib.request.urlopen(urls[i])
            soup = BeautifulSoup(page, 'html.parser')
            getSurgeonName(soup)
    except:
        print("An URL request error occurred.")

Function, version 1:

def getSurgeonName(soup):
    # gets just first 3 surgeons of site
    docName = re.compile('css-15dj4ut')
    docNameTags = soup.find_all('div', attrs={'class': docName})
    for a in docNameTags:
        docNameList.append(a.getText())

Function, version 2:

def getSurgeonName(soup):
    parentClass = re.compile('css-fh99y9 excbu0j0')
    parentItems = soup.find_all('div', attrs={'class': parentClass})
    for parent in parentItems:
        children = parent.findChildren('div', {"class": "css-15dj4ut"})
        docNameList.append(children[0].getText())
    parentClass = re.compile('css-roynbj excbu0j0')
    parentItems = soup.find_all('div', attrs={'class': parentClass})
    for parent in parentItems:
        children = parent.findChildren('div', {'class': 'css-15dj4ut'})
        docNameList.append(children[0].getText())

1 Answer

大話西游666


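Actually, the data you want is loaded dynamically by JavaScript after the page loads, and the requests package does not execute JavaScript, so those div tags never appear in the HTML it receives. However, I was able to locate the script tag that holds the data and load its contents as JSON into a dict, from which you can parse whatever you want :).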


import requests
from bs4 import BeautifulSoup
import json

r = requests.get("https://www.comparis.ch/gesundheit/arzt/pathologie")
soup = BeautifulSoup(r.content, 'html.parser')

# The page embeds its data as JSON inside this script tag
script = soup.find("script", {'id': '__NEXT_DATA__'}).text

data = json.loads(script)
print(data.keys())  # top-level keys of the parsed JSON dict

dumper = json.dumps(data, indent=4)
print(dumper)  # to see it in human-readable format

For example:

for item in data['props']['pageProps']['doctorResults']['doctorModels']:
    print(item['name'])

Output:

Mohamed Abdou
Dr. med. Heiner Adams
Dr. med. Franziska Aebersold
Prof. Dr. med. Adriano Aguzzi
Dr. med. Maria Ammann
Prosper Anani
Dr. med. Max Arnaboldi
Dr. med. Walter Arnold
Dr. med. Irena Baltisser
Dr. med. Fridolin Bannwart
Dr. med. Yara Banz
Dr. med. André Barghorn
Dr. Jessica Barizzi
Prof. Dr. med. Daniel Baumhoer
Audrey Baur Chaubert
Dr. med. Christian Georg Bayerl
Dr. med. Marc Beer
Dr. med. Sabina Berezowska
Dr. med. Steffen Bergelt
Dr. med. Barbara Elisabeth Berger-Denzler
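
The lookup above can be wrapped in a small helper that tolerates a missing script tag or a changed JSON layout. This is only a minimal sketch, not the answerer's code: `extract_doctor_names` and `SAMPLE_HTML` are hypothetical names, and the inline HTML is a made-up stand-in for a downloaded page so that the snippet runs without network access. The key path is the one used in the answer.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page (assumption: the real page
# embeds its data in the same place, with many more fields).
SAMPLE_HTML = """
<html><body>
<div id="__next"></div>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"doctorResults": {"doctorModels": [
  {"name": "Dr. med. Heiner Adams"},
  {"name": "Dr. med. Yara Banz"}
]}}}}
</script>
</body></html>
"""


def extract_doctor_names(html):
    """Return the names found in the __NEXT_DATA__ script tag, or [] if absent."""
    soup = BeautifulSoup(html, "html.parser")
    script = soup.find("script", id="__NEXT_DATA__")
    if script is None or not script.string:
        return []  # no embedded JSON on this page

    node = json.loads(script.string)
    # Walk the key path from the answer, stopping early if the layout differs.
    for key in ("props", "pageProps", "doctorResults", "doctorModels"):
        if not isinstance(node, dict):
            return []
        node = node.get(key)
    if not isinstance(node, list):
        return []

    return [item["name"] for item in node
            if isinstance(item, dict) and "name" in item]


names = extract_doctor_names(SAMPLE_HTML)
print(names)
```

In the asker's loop, a helper like this could be called on the HTML of each page (for example `r.text` from `requests.get`) and its result appended to `docNameList`.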


Answered 2023-10-30