動(dòng)漫人物
2023-01-04 16:27:59
In Scrapy, I'm trying to retrieve data from a database that a spider scraped and stored via pipelines.py, and use that data in another spider. Specifically, I want to read links from the database and use them in the start_requests function. I know this problem is also covered here: Scrapy: Get Start_Urls from Database by Pipeline. I tried to follow that example, but unfortunately it doesn't work, and I don't know why; I just know I've made a mistake somewhere.

pipelines.py

import sqlite3

class HeurekaScraperPipeline:
    def __init__(self):
        self.create_connection()
        self.create_table()

    def create_connection(self):
        self.conn = sqlite3.connect('shops.db')
        self.curr = self.conn.cursor()

    def create_table(self):
        self.curr.execute("""DROP TABLE IF EXISTS shops_tb""")
        self.curr.execute("""create table shops_tb(
            product_name text,
            shop_name text,
            price text,
            link text
        )""")

    def process_item(self, item, spider):
        self.store_db(item)
        return item

    def store_db(self, item):
        self.curr.execute("""insert into shops_tb values (?, ?, ?, ?)""", (
            item['product_name'],
            item['shop_name'],
            item['price'],
            item['link'],
        ))
        self.conn.commit()

spider

import scrapy

class Shops_spider(scrapy.Spider):
    name = 'shops_scraper'
    custom_settings = {'DOWNLOAD_DELAY': 1}

    def start_requests(self):
        db_cursor = HeurekaScraperPipeline().curr
        db_cursor.execute("SELECT * FROM shops_tb")
        links = db_cursor.fetchall()
        for link in links:
            url = link[3]
            print(url)
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        url = response.request.url
        print('********************************' + url + '************************')

Thanks in advance for your help.
1 Answer

鳳凰求蠱
Pipelines are for processing items. If you want to read something from the database, open the connection and run the query in start_requests. From the docs:

After an item has been scraped by a spider, it is sent to the Item Pipeline which processes it through several components that are executed sequentially.

Also note that instantiating HeurekaScraperPipeline() the way your spider does is destructive: its __init__ calls create_table(), which drops and recreates shops_tb, so the SELECT that follows always runs against an empty table. Why not open the DB connection in start_requests instead?
def start_requests(self):
    # open the connection in the spider itself rather than through the pipeline
    self.conn = sqlite3.connect('shops.db')
    self.curr = self.conn.cursor()
    self.curr.execute("SELECT * FROM shops_tb")
    links = self.curr.fetchall()
    # rest of the code
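For completeness, here is a minimal sketch of what the whole spider could look like with this change. It selects only the link column (so no hard-coded link[3] index), and closing the connection in the closed() hook is my own addition for tidiness, not something required by the fix:

import sqlite3
import scrapy

class Shops_spider(scrapy.Spider):
    name = 'shops_scraper'
    custom_settings = {'DOWNLOAD_DELAY': 1}

    def start_requests(self):
        # read the links that the first spider's pipeline wrote to shops.db
        self.conn = sqlite3.connect('shops.db')
        self.curr = self.conn.cursor()
        self.curr.execute("SELECT link FROM shops_tb")
        for (url,) in self.curr.fetchall():
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        self.logger.info('scraped %s', response.request.url)

    def closed(self, reason):
        # Scrapy calls this when the spider finishes; release the connection here
        self.conn.close()

Just make sure the spider that fills shops_tb has finished before this one runs, since the pipeline drops and recreates the table at startup.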