
Scrapy not parsing items


拉風(fēng)的咖菲貓 2021-12-21 17:26:15
I am trying to scrape a web page with pagination, but the callback never parses the items. Any help would be appreciated. Here is the code:

# -*- coding: utf-8 -*-
import scrapy
from ..items import EscrotsItem


class Escorts(scrapy.Spider):
    name = 'escorts'
    allowed_domains = ['www.escortsandbabes.com.au']
    start_urls = ['https://escortsandbabes.com.au/Directory/ACT/Canberra/2600/Any/All/']

    def parse_links(self, response):
        for i in response.css('.btn.btn-default.btn-block::attr(href)').extract()[2:]:
            yield scrapy.Request(url=response.urljoin(i), callback=self.parse)
        NextPage = response.css('.page.next-page::attr(href)').extract_first()
        if NextPage:
            yield scrapy.Request(
                url=response.urljoin(NextPage),
                callback=self.parse_links)

    def parse(self, response):
        for x in response.xpath('//div[@class="advertiser-profile"]'):
            item = EscrotsItem()
            item['Name'] = x.css('.advertiser-names--display-name::text').extract_first()
            item['Username'] = x.css('.advertiser-names--username::text').extract_first()
            item['Phone'] = x.css('.contact-number::text').extract_first()
            yield item

1 Answer

溫溫醬


Your spider fetches the URLs in start_urls, and Scrapy runs parse on those responses by default. Since the listing page contains no div.advertiser-profile elements, the spider simply closes without any results, and your parse_links function is never called at all.
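That dispatch rule can be sketched in plain Python (a simplified illustration, not Scrapy's actual source): requests generated from start_urls carry no explicit callback, so the engine falls back to the spider's parse method, whatever that method happens to do.

```python
# Simplified model of Scrapy's callback dispatch (illustration only):
# requests built from start_urls have no callback attached, so the
# engine resolves them to spider.parse.

class Request:
    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback


class Spider:
    start_urls = ["https://example.com/listing"]  # hypothetical URL

    def start_requests(self):
        # Default behaviour: no callback is attached here.
        for url in self.start_urls:
            yield Request(url)

    def parse(self, response):
        return "parse"           # marker: the default callback ran

    def parse_links(self, response):
        return "parse_links"     # marker: the explicit callback ran


def dispatch(spider, request, response=None):
    # The engine falls back to spider.parse when callback is None.
    callback = request.callback or spider.parse
    return callback(response)


spider = Spider()
first = next(spider.start_requests())
print(dispatch(spider, first))  # -> parse
```

So whichever method is named parse receives the start_urls responses; if that method only looks for profile divs that the listing page does not have, nothing is ever yielded.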


Swap the two function names:


import scrapy


class Escorts(scrapy.Spider):
    name = 'escorts'
    allowed_domains = ['escortsandbabes.com.au']
    start_urls = ['https://escortsandbabes.com.au/Directory/ACT/Canberra/2600/Any/All/']

    def parse(self, response):
        # Listing page: follow each profile link into parse_links.
        for i in response.css('.btn.btn-default.btn-block::attr(href)').extract()[2:]:
            yield scrapy.Request(response.urljoin(i), self.parse_links)
        # Pagination: no callback given, so the next page comes back to parse.
        next_page = response.css('.page.next-page::attr(href)').get()
        if next_page:
            yield scrapy.Request(response.urljoin(next_page))

    def parse_links(self, response):
        # Profile page: extract the advertiser fields.
        for x in response.xpath('//div[@class="advertiser-profile"]'):
            item = {}
            item['Name'] = x.css('.advertiser-names--display-name::text').get()
            item['Username'] = x.css('.advertiser-names--username::text').get()
            item['Phone'] = x.css('.contact-number::text').get()
            yield item

Here is my log from scrapy shell:


In [1]: fetch("https://escortsandbabes.com.au/Directory/ACT/Canberra/2600/Any/All/")
2019-03-29 15:22:56 [scrapy.core.engine] INFO: Spider opened
2019-03-29 15:23:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://escortsandbabes.com.au/Directory/ACT/Canberra/2600/Any/All/> (referer: None, latency: 2.48 s)

In [2]: response.css('.page.next-page::attr(href)').get()
Out[2]: u'/Directory/ACT/Canberra/2600/Any/All/?p=2'


Answered 2021-12-21