
Downloading a video with selenium fails with "401 Unauthorized"

Python

慕哥9229398 2023-10-05 16:52:57

I am trying to create a bot that uses selenium and python3 to download videos from a website called "Sdarot". Each video (or episode) on the site has its own page and URL. When you open an episode, you have to wait 30 seconds for it to "load", and only then does the <video> tag appear in the HTML source.

The problem is that the request for the video is encrypted or protected in one way or another (I don't really understand how it works)! When I simply wait for the video tag to appear and then try to download the video with the urllib library (see the code below), I get the following error:

    urllib.error.HTTPError: HTTP Error 401: Unauthorized

I should note that when I open the video's download link inside the selenium driver, it opens perfectly fine and I can download it manually.

How can I download the video automatically? Thanks in advance!

Code:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    import urllib.request


    def load(driver, url):
        driver.get(url)  # open the page in the browser
        try:
            # wait for the episode to "load"
            # if something is wrong and the episode doesn't load after 45 seconds,
            # the function will call itself again and try to load again.
            continue_btn = WebDriverWait(driver, 45).until(
                EC.element_to_be_clickable((By.ID, "proceed"))
            )
        except:
            load(url)


    def save_video(driver, filename):
        video_element = driver.find_element_by_tag_name(
            "video")  # get the video element
        video_url = video_element.get_property('src')  # get the video url
        # trying to download the video
        urllib.request.urlretrieve(video_url, filename)
        # ERROR: "urllib.error.HTTPError: HTTP Error 401: Unauthorized"


    def main():
        URL = r'https://www.sdarot.dev/watch/339-%D7%94%D7%A4%D7%99%D7%92-%D7%9E%D7%95%D7%AA-ha-pijamot/season/1/episode/23'
        DRIVER = webdriver.Chrome()
        load(DRIVER, URL)
        video_url = save_video(DRIVER, "video.mp4")


    if __name__ == "__main__":
        main()


1 Answer

慕哥6287543


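You are getting the Unauthorized error because the site uses a cookie to store information related to your session. Specifically, the cookie is named Sdarot. I used the requests library to download and save the video.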

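The key point is that opening the URL in selenium works because selenium uses the same HTTP client (the browser), which already has the cookie available. When you call urllib, you are using a different HTTP client, so the server sees a new request that carries none of that session. To overcome this, you have to provide enough session information, just as the browser does. In this case that information is kept in a cookie.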
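To illustrate the point about session information, here is a minimal sketch of turning the cookie list that selenium returns from `driver.get_cookies()` into the plain name-to-value mapping that `requests` accepts. The helper name `cookies_for_requests` and the sample cookie values are hypothetical and only serve the illustration; the real values come from the live browser session.

```python
def cookies_for_requests(selenium_cookies, names=None):
    """Convert Selenium's list of cookie dicts into a name -> value mapping.

    selenium_cookies: the list returned by driver.get_cookies()
    names: optional iterable of cookie names to keep; None keeps all of them
    """
    wanted = None if names is None else set(names)
    return {
        cookie["name"]: cookie["value"]
        for cookie in selenium_cookies
        if wanted is None or cookie["name"] in wanted
    }


# Sample data shaped like driver.get_cookies() output (values are made up).
browser_cookies = [
    {"name": "Sdarot", "value": "abc123", "domain": ".sdarot.dev", "path": "/"},
    {"name": "other", "value": "xyz", "domain": ".sdarot.dev", "path": "/"},
]

session_cookies = cookies_for_requests(browser_cookies, names=["Sdarot"])
print(session_cookies)  # {'Sdarot': 'abc123'}
```

The resulting dictionary can be passed as the `cookies=` argument of `requests.get`. Passing `names=None` forwards every browser cookie, which may help if the site starts checking more than one.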

See how I extract the value of the Sdarot cookie and pass it to the requests.get method. You could also do this with urllib.

    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    import requests


    def load(driver, url):
        driver.get(url)  # open the page in the browser
        try:
            # wait for the episode to "load"
            # if something is wrong and the episode doesn't load after 45 seconds,
            # the function will call itself again and try to load again.
            continue_btn = WebDriverWait(driver, 45).until(
                EC.element_to_be_clickable((By.ID, "proceed"))
            )
            continue_btn.click()
        except TimeoutException:
            load(driver, url)  # corrected parameter error


    def save_video(driver, filename):
        # get the video element (the find_element_by_* helpers were removed
        # in newer Selenium 4 releases, so the By-based form is used here)
        video_element = driver.find_element(By.TAG_NAME, "video")
        video_url = video_element.get_property('src')  # get the video url

        # iterate over all the cookies and extract the value of the one named Sdarot
        cookies = {}
        for entry in driver.get_cookies():
            if entry["name"] == 'Sdarot':
                cookies[entry["name"]] = entry["value"]

        # send the request with the proper cookies
        r = requests.get(video_url, cookies=cookies, stream=True)
        r.raise_for_status()  # stop on 401 instead of saving an error page

        # start download
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 1024):
                if chunk:
                    f.write(chunk)


    def main():
        URL = r'https://www.sdarot.dev/watch/339-%D7%94%D7%A4%D7%99%D7%92-%D7%9E%D7%95%D7%AA-ha-pijamot/season/1/episode/23'
        DRIVER = webdriver.Chrome()
        load(DRIVER, URL)
        save_video(DRIVER, "video.mp4")


    if __name__ == "__main__":
        main()
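The answer notes that the same thing can be done with urllib. A minimal sketch of that variant, assuming the cookies are sent in a single `Cookie` request header: the helper names `build_cookie_header`, `build_video_request`, and `download` are hypothetical, and the URL and cookie value in the example are placeholders rather than real site data.

```python
import urllib.request


def build_cookie_header(selenium_cookies):
    """Join Selenium cookie dicts into one Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def build_video_request(video_url, selenium_cookies):
    """Create a urllib Request that carries the browser's cookies."""
    headers = {"Cookie": build_cookie_header(selenium_cookies)}
    return urllib.request.Request(video_url, headers=headers)


def download(request, filename, chunk_size=1024 * 1024):
    """Stream the response body to disk in chunks."""
    with urllib.request.urlopen(request) as response, open(filename, "wb") as f:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            f.write(chunk)


# Placeholder inputs; in the bot these would be video_element.get_property('src')
# and the Sdarot entry from driver.get_cookies().
example_request = build_video_request(
    "https://example.com/video.mp4",
    [{"name": "Sdarot", "value": "abc123"}],
)
print(example_request.get_header("Cookie"))  # Sdarot=abc123
```

Calling `download(example_request, "video.mp4")` would then perform the actual transfer. If the server still answers 401, it may be checking further headers that the browser sends, which this sketch does not attempt to reproduce.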

Answered 2023-10-05
