1 Answer

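You get the unauthorized error because the site uses a cookie to store information about your session. Specifically, the cookie is named Sdarot. I used the requests library to download and save the video.
The key point is that opening the URL with selenium works because selenium uses the same HTTP client (the browser), which already holds the cookie. When you call the URL with urllib, you are using a different HTTP client that has no cookie, so the server sees a new request with no session attached. To get around this, you have to supply the same session information the browser does, which in this case is carried by the cookie.
See how I extract the value of the Sdarot cookie and pass it to requests.get. You can do the same thing with urllib.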
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import requests


def load(driver, url):
    driver.get(url)  # open the page in the browser
    try:
        # wait for the episode to "load"
        # if the button is not clickable after 45 seconds,
        # the function calls itself and tries to load the page again.
        continue_btn = WebDriverWait(driver, 45).until(
            EC.element_to_be_clickable((By.ID, "proceed"))
        )
        continue_btn.click()
    except TimeoutException:
        load(driver, url)  # corrected parameter error


def save_video(driver, filename):
    # get the video element (find_element_by_tag_name was removed in Selenium 4)
    video_element = driver.find_element(By.TAG_NAME, "video")
    video_url = video_element.get_property("src")  # get the video url
    # keep only the cookie named Sdarot from the browser session
    cookies = {
        entry["name"]: entry["value"]
        for entry in driver.get_cookies()
        if entry["name"] == "Sdarot"
    }
    # send the request with the session cookie
    r = requests.get(video_url, cookies=cookies, stream=True)
    r.raise_for_status()  # fail early if the server still rejects the request
    # start download
    with open(filename, "wb") as f:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            if chunk:
                f.write(chunk)


def main():
    URL = r'https://www.sdarot.dev/watch/339-%D7%94%D7%A4%D7%99%D7%92-%D7%9E%D7%95%D7%AA-ha-pijamot/season/1/episode/23'
    DRIVER = webdriver.Chrome()
    load(DRIVER, URL)
    save_video(DRIVER, "video.mp4")


if __name__ == "__main__":
    main()
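
Since the same thing can be done with urllib, here is a rough sketch of that variant using only the standard library. The helper names (cookie_header, build_request, save_video_urllib) are made up for illustration and are not part of the code above. It assumes the Sdarot cookie alone is enough for the server; if the site also checks other headers such as User-Agent or Referer, those would have to be added as well.

```python
import shutil
import urllib.request


def cookie_header(selenium_cookies, name="Sdarot"):
    """Build a Cookie header value from the list returned by driver.get_cookies().

    Hypothetical helper: looks for the cookie called `name` and formats it
    as "name=value".
    """
    for entry in selenium_cookies:
        if entry["name"] == name:
            return f"{entry['name']}={entry['value']}"
    raise KeyError(f"cookie {name!r} not found in the browser session")


def build_request(video_url, selenium_cookies):
    """Create a urllib request that carries the browser's session cookie."""
    request = urllib.request.Request(video_url)
    request.add_header("Cookie", cookie_header(selenium_cookies))
    return request


def save_video_urllib(video_url, selenium_cookies, filename):
    """Download the video to `filename`, copying the body to disk in chunks."""
    request = build_request(video_url, selenium_cookies)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    # so a rejected session shows up as an exception here.
    with urllib.request.urlopen(request) as response, open(filename, "wb") as f:
        shutil.copyfileobj(response, f)
```

Inside save_video you would then call save_video_urllib(video_url, driver.get_cookies(), filename) instead of requests.get.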