
Access the spider's response in an item pipeline in Scrapy

I have a spider like this:

import scrapy


class ProductsSpider(scrapy.Spider):
    name = "products"
    allowed_domains = ["example.com"]
    start_urls = [
        'http://example.com/url'
    ]

    def parse(self, response):
        # Parsing logic omitted in the question; items yielded here go to the pipeline.
        pass

And I have a pipeline class like this:

class ProductsDataPipeline(object):
    """ Item pipeline for products data crawler """

    def process_item(self, item, spider):
        return item

But I want to access the response argument of the parse function inside the process_item function, without setting it as an attribute on the item object. Is that possible?

No, it's not possible.

Responses are not forwarded to pipelines. You either have to store the response in the item, or use some external storage to hold the response and fetch it in the pipeline. The second option is much better and avoids many of the problems that can result from storing the response in the item (e.g. memory problems). For example, you save the response to some form of storage in the parse callback, save a reference to that storage in an item field, and fetch the response from storage in the pipeline.
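A minimal sketch of the external-storage approach, assuming items are plain dicts. RESPONSE_STORE, stash_response, and the response_key and page_size fields are hypothetical names made up for illustration, and the in-memory dict only stands in for whatever storage you actually use (disk, a cache, a database):

```python
import uuid

# Hypothetical in-memory store; a real crawl would more likely use disk or a cache.
RESPONSE_STORE = {}


def stash_response(response):
    """Save what the pipeline needs from the response and return a small lookup key."""
    key = uuid.uuid4().hex
    RESPONSE_STORE[key] = {"url": response.url, "body": response.body}
    return key


class ProductsDataPipeline(object):
    """ Item pipeline that fetches the stored response by the key in the item """

    def process_item(self, item, spider):
        # pop() removes the entry so the store does not keep growing during the crawl
        stored = RESPONSE_STORE.pop(item.pop("response_key", None), None)
        if stored is not None:
            # Example use of the response data: record the size of the page body
            item["page_size"] = len(stored["body"])
        return item
```

In the parse callback you would then yield something like {"response_key": stash_response(response)} together with the scraped fields, so the item carries only a short key rather than the whole response.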

But it really depends on what you are trying to do. The response is available in the spider middleware method process_spider_output, so perhaps you can use that instead of processing the item in a pipeline.
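A rough sketch of the middleware route, again assuming dict items. The class name ResponseAwareMiddleware and the source_url field are hypothetical; the point is only that process_spider_output receives the response alongside everything the callback yielded:

```python
class ResponseAwareMiddleware(object):
    """ Spider middleware that sees each response together with the callback's output """

    def process_spider_output(self, response, result, spider):
        for element in result:
            # Callbacks may also yield Requests; only dict items are touched here.
            if isinstance(element, dict):
                # Work that needs the response happens here, not in the pipeline
                element["source_url"] = response.url
            yield element
```

To try it, the class would need to be registered in the SPIDER_MIDDLEWARES setting of your project, under whatever module path it lives in.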
