Is there a way to process a scrapy.Request object in the shell?

Question

In the terminal, I ran

scrapy startproject tutorial

I created the following spider in the spiders folder

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ['http://quotes.toscrape.com/page/1/']

In the terminal, I ran

scrapy shell 'http://quotes.toscrape.com/page/1/'

This all works fine. In the Python shell that opens, I get

>>> response
<200 http://quotes.toscrape.com/page/1/>

Next, I ran

>>> next_page = response.css('li.next a::attr(href)').extract_first()
>>> next_page
'/page/2/'

>>> response.follow(next_page)
<GET http://quotes.toscrape.com/page/2/>

>>> type(response.follow(next_page))
<class 'scrapy.http.request.Request'>

I would like to get a new Response object in the shell, based on the link that next_page points to. Is this possible? Any help is much appreciated.

I have tried the following, but could not get past the error.

>>> scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware.process_request(response.follow(next_page), "quotes")
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: process_request() missing 1 required positional argument: 'spider'

Answer 1

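Use fetch(), which accepts either a URL or a Request object and updates response (and the related shell objects) with the result: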

>>> fetch(response.follow(next_page))
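As for why the attempt in the question raised a TypeError: process_request is an instance method, but it was called on the class itself. Python therefore bound the Request to self and "quotes" to request, leaving spider unfilled. The sketch below reproduces this with a hypothetical stand-in class (FakeHeadersMiddleware is an illustration only, not Scrapy's actual middleware), so it runs without Scrapy installed.

```python
class FakeHeadersMiddleware:
    """Hypothetical stand-in that mimics a downloader middleware's method signature."""

    def process_request(self, request, spider):
        # A middleware typically returns None to let processing continue.
        return None


# Called on the class: "req" binds to self and "quotes" binds to request,
# so the spider parameter is left without a value.
try:
    FakeHeadersMiddleware.process_request("req", "quotes")
except TypeError as exc:
    message = str(exc)
    print(message)

# Called on an instance: self is supplied automatically and both arguments line up.
result = FakeHeadersMiddleware().process_request("req", "quotes")
```

Even with a proper instance, calling a single middleware by hand would not download anything, which is why fetch() is the more practical route inside the shell.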
