
Is there a way to process scrapy.Request object in the shell?

In the terminal, I ran

scrapy startproject tutorial

I created the following spider in the spiders folder

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ['http://quotes.toscrape.com/page/1/']

In the terminal, I ran

scrapy shell 'http://quotes.toscrape.com/page/1/'

This all works fine; in the Python shell that opens up, I get

>>> response
<200 http://quotes.toscrape.com/page/1/>

Now, I ran

>>> next_page = response.css('li.next a::attr(href)').extract_first()
>>> next_page
'/page/2/'

>>> response.follow(next_page)
<GET http://quotes.toscrape.com/page/2/>

>>> type(response.follow(next_page))
<class 'scrapy.http.request.Request'>
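As the `type()` call shows, `response.follow` returns a `Request`, not a `Response`; nothing has been downloaded yet. The URL resolution it performs mirrors the standard library's `urljoin` (a small illustration, independent of Scrapy):

```python
from urllib.parse import urljoin

# response.follow resolves the relative href against the current page URL,
# the same way urljoin does:
base = 'http://quotes.toscrape.com/page/1/'
next_page = '/page/2/'
print(urljoin(base, next_page))  # http://quotes.toscrape.com/page/2/
```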

I would like to get a new Response object in the shell, based on the link in next_page. Is this possible at all? Any help is much appreciated.

I tried the below already, but couldn't fix the error.

>>> scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware.process_request(response.follow(next_page), "quotes")
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: process_request() missing 1 required positional argument: 'spider'
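The TypeError here is plain Python, not Scrapy: process_request is an instance method, so calling it on the class consumes the Request as self and the string "quotes" as request, leaving spider unfilled. (In any case, downloader middlewares only modify requests; they do not fetch responses.) A minimal stand-in class, no Scrapy required, reproduces the same failure:

```python
class Middleware:
    # Stand-in for DefaultHeadersMiddleware: an instance method
    # with the signature (self, request, spider).
    def process_request(self, request, spider):
        return None

# Calling it on the class binds "req" to self and "quotes" to request,
# so spider is missing -> the same TypeError as in the shell.
try:
    Middleware.process_request("req", "quotes")
except TypeError as e:
    print(e)
```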

Use fetch()

In the Scrapy shell, fetch() accepts either a URL or a Request object, downloads it, and rebinds the shell's response variable to the new response:

>>> fetch(response.follow(next_page))
>>> response
<200 http://quotes.toscrape.com/page/2/>
