Try this:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
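# Selenium 3 style; Selenium 4.10+ no longer accepts the driver path positionally (use webdriver.Chrome() or a Service object).
driver = webdriver.Chrome("chromedriver.exe")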
driver.get("https://www.google.com/")
print(driver.title)
driver.maximize_window()
time.sleep(2)
driver.find_element(By.XPATH, "//input[@name='q']").send_keys('selenium')
driver.find_element(By.XPATH, "//div[@class='FPdoLc tfB0Bf']//input[@name='btnK']").send_keys(Keys.ENTER)
a = driver.find_elements(By.XPATH, "//div[@class='r']/a")
links = []
for x in a:  # collect the href of every result link in the 'links' list
    links.append(x.get_attribute('href'))
link_data = []
for new_url in links:  # visit each page and store its page source in link_data
    print('new url : ', new_url)
    driver.get(new_url)
    link_data.append(driver.page_source)
    driver.back()
# print('link data len : ', len(link_data))
# print('link data [0] : ', link_data[0])  # print the first page's source
This code visits every link and stores each page's source in the link_data list.
For the p tags, you can use the following code:
from bs4 import BeautifulSoup as bs
page = bs(link_data[0],'html.parser')
p_tag = page.find_all('p')
print(p_tag)
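The snippet above only parses the first page (link_data[0]). To get the paragraph text from every stored page, something like the sketch below might help. It swaps BeautifulSoup for the standard library's html.parser so it runs without extra installs. The names ParagraphExtractor, paragraphs_per_page and sample_pages are made up for illustration, and sample_pages stands in for the link_data list.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Hypothetical helper: collects the text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []     # finished paragraph strings
        self._inside = False     # True while inside a <p>
        self._chunks = []        # text pieces of the current paragraph

    def _flush(self):
        # Store the current paragraph and reset the buffer.
        self.paragraphs.append("".join(self._chunks).strip())
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            if self._inside:     # previous <p> was never closed
                self._flush()
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "p" and self._inside:
            self._flush()
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)


def paragraphs_per_page(pages):
    """Return one list of paragraph strings per HTML page source."""
    result = []
    for html in pages:
        parser = ParagraphExtractor()
        parser.feed(html)
        parser.close()
        result.append(parser.paragraphs)
    return result


# Stand-in for link_data (assumed sample input, not real scraped pages).
sample_pages = [
    "<html><body><p>First</p><div><p>Second <b>bold</b></p></div></body></html>",
    "<html><body><h1>No paragraphs</h1></body></html>",
]
print(paragraphs_per_page(sample_pages))
```

In the original script you would call paragraphs_per_page(link_data) once the loop has finished. With BeautifulSoup installed, looping over link_data and calling find_all('p') on each page gives an equivalent result.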