CrawlSpider gets redirected when crawling Lagou. I have already obtained the cookies with selenium. Could any experts please help me?


Mr1011 2018-06-21 23:58:52
My code is as follows:

    # -*- coding: utf-8 -*-
    import scrapy
    from scrapy.linkextractors import LinkExtractor
    from scrapy.spiders import CrawlSpider, Rule
    from ..utils.common import login_lagou
    from scrapy.http import Request


    class LagouSpider(CrawlSpider):
        name = 'lagou'
        allowed_domains = ['www.lagou.com']
        start_urls = ['https://www.lagou.com/']

        rules = (
            Rule(LinkExtractor(allow=(r"zhaopin/.*",)), follow=True),                           # Rule for job listings
            Rule(LinkExtractor(allow=(r"gongsi/\d+.html",)), follow=True),                      # Rule for companies
            Rule(LinkExtractor(allow=(r'jobs/\d+.html',)), callback='parse_job', follow=True),  # Rule for individual job postings
        )

        headers = {
            "Host": 'passport.lagou.com',
            "Origin": 'https://passport.lagou.com',
            "Referer": 'https://passport.lagou.com/login/login.html',
            "User-Agent": 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                          'Chrome/67.0.3396.87 Safari/537.36',
            "X-Requested-With": "XMLHttpRequest",
            "Content-Type": 'application/x-www-form-urlencoded;charset=UTF-8'
        }

        def start_requests(self):
            self.cookies = login_lagou()
            print(self.cookies)
            self.headers.update({
                "Cookie": self.cookies
            })
            print(self.headers)
            yield Request(url=self.start_urls[0],
                          cookies=self.cookies,
                          headers=self.headers,
                          callback=self.parse,
                          dont_filter=True)

        # def parse_start_url(self, response):
        #     return []
        #
        # def process_results(self, response, results):
        #     return results

        def parse_job(self, response):
            # Parse a Lagou job posting
            i = {}
            print(response)
            #i['domain_id'] = response.xpath('//input[@id="sid"]/@value').extract()
            #i['name'] = response.xpath('//div[@id="name"]').extract()
            #i['description'] = response.xpath('//div[@id="description"]').extract()
            return i

The configuration in setting.py:

    HTTPERROR_ALLOWED_CODES = [302]
    # HTTPERROR_ALLOWED_CODES = [400]
    COOKIES_ENABLED = False
    REDIRECT_ENABLED = False   # disable redirects
    DOWNLOAD_DELAY = 6         # wait 6 s between requests to avoid being banned
    DOWNLOAD_TIMEOUT = 10      # download timeout
    RETRY_ENABLED = True       # enable retries
    RETRY_TIMES = 3            # number of retries
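A few things in this setup may be worth checking. They are guesses, not a confirmed diagnosis. With COOKIES_ENABLED = False, Scrapy's cookies middleware is switched off, so the cookies= argument on the Request is not sent, and the requests generated by the Rules do not carry the login cookies either. The "Cookie" header also has to be a single string, and the return type of login_lagou() is not shown here. Finally, the Host, Origin and Referer headers point at passport.lagou.com while the request goes to www.lagou.com. The sketch below assumes login_lagou() returns what selenium's driver.get_cookies() returns (a list of dicts with "name" and "value" keys). The helper names and the sample cookie values are hypothetical.

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Turn selenium's list of cookie dicts into a plain {name: value} dict.

    Scrapy's Request(cookies=...) accepts a dict of this shape.
    """
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def build_cookie_header(cookie_dict):
    """Build the string form needed if the cookies are sent as a raw Cookie header."""
    return "; ".join(f"{name}={value}" for name, value in cookie_dict.items())


# Hypothetical sample in the shape returned by driver.get_cookies()
sample = [
    {"name": "JSESSIONID", "value": "abc123", "domain": ".lagou.com", "path": "/"},
    {"name": "user_trace_token", "value": "xyz789", "domain": ".lagou.com", "path": "/"},
]

cookies = selenium_cookies_to_dict(sample)
header = build_cookie_header(cookies)
print(cookies)
print(header)
```

One option is to set COOKIES_ENABLED = True and pass the dict through cookies= on the first Request, so the cookies middleware can keep sending them on the followed links. If the middleware stays disabled, the string from build_cookie_header would have to be attached as the "Cookie" header on every request, including those created by the Rules.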