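You get the Unauthorized error because the site uses a cookie to store some information related to your session. Specifically, the cookie is named Sdarot. I used the requests library to download and save the video.
The key point is that opening the URL with selenium works because selenium uses the same HTTP client (the browser), which already has the cookie details available. When you call urllib, it is a different HTTP client, so the server sees a new request with no session attached. To overcome this, you have to supply enough session information, just as the browser does; in this case the session is maintained by the cookie.
See how I extract the value of the Sdarot cookie and pass it to the requests.get method. You can also do this with urllib.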
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import requests


def load(driver, url):
    driver.get(url)  # open the page in the browser
    try:
        # wait for the episode to "load"
        # if something is wrong and the episode doesn't load after 45 seconds,
        # the function will call itself again and try to load again.
        continue_btn = WebDriverWait(driver, 45).until(
            EC.element_to_be_clickable((By.ID, "proceed"))
        )
        continue_btn.click()
    except Exception:
        load(driver, url)  # retry with the same driver and url


def save_video(driver, filename):
    # find_element_by_tag_name was removed in newer Selenium 4 releases;
    # find_element(By.TAG_NAME, ...) is the replacement
    video_element = driver.find_element(By.TAG_NAME, "video")  # get the video element
    video_url = video_element.get_property("src")  # get the video url
    # iterate all the cookies and keep only the one named Sdarot
    cookies = {
        entry["name"]: entry["value"]
        for entry in driver.get_cookies()
        if entry["name"] == "Sdarot"
    }
    # send the request with the session cookie attached
    r = requests.get(video_url, cookies=cookies, stream=True)
    r.raise_for_status()  # fail early instead of saving an error page as a video
    # start download
    with open(filename, "wb") as f:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            if chunk:
                f.write(chunk)


def main():
    URL = r'https://www.sdarot.dev/watch/339-%D7%94%D7%A4%D7%99%D7%92-%D7%9E%D7%95%D7%AA-ha-pijamot/season/1/episode/23'
    DRIVER = webdriver.Chrome()
    try:
        load(DRIVER, URL)
        save_video(DRIVER, "video.mp4")
    finally:
        DRIVER.quit()  # close the browser even if the download fails


if __name__ == "__main__":
    main()
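Since the answer says the same thing can be done with urllib, here is a minimal sketch of how that might look using only the standard library. The helper names (cookie_header, build_request, download) are made up for illustration and are not part of the original answer. The sketch also assumes the server only needs the Sdarot cookie sent as a plain Cookie header.

```python
import shutil
import urllib.request


def cookie_header(selenium_cookies, name="Sdarot"):
    """Return a 'name=value' Cookie header for the named cookie, or None."""
    for entry in selenium_cookies:
        if entry["name"] == name:
            return f"{entry['name']}={entry['value']}"
    return None


def build_request(video_url, selenium_cookies, name="Sdarot"):
    """Build a urllib Request that carries the browser's session cookie."""
    request = urllib.request.Request(video_url)
    header = cookie_header(selenium_cookies, name)
    if header is not None:
        request.add_header("Cookie", header)
    return request


def download(video_url, selenium_cookies, filename):
    """Stream the response body to disk without loading it all into memory."""
    request = build_request(video_url, selenium_cookies)
    with urllib.request.urlopen(request) as response, open(filename, "wb") as f:
        shutil.copyfileobj(response, f)
```

With this in place, save_video could call download(video_url, driver.get_cookies(), filename) instead of requests.get. If the server also checks other headers (for example User-Agent or Referer), those would have to be added to the request in the same way.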