
Scrapy: how to pass an item to the close_spider method in a pipeline

I yield and process many items, and in some cases I update a tracking sheet. The tracking sheet contains several attributes, including country, and all of them come from the item. All of this happens in the pipeline. After the spider closes, I have to send the tracking sheet to the responsible people, grouped by country. However, I can't pass the item to the method where I catch the spider closing.

To catch the moment the spider closes, I use this:

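from scrapy import signals

# The two methods below live inside the pipeline class.
@classmethod
def from_crawler(cls, crawler):
    # Create the pipeline and subscribe its handler to the spider_closed signal.
    temp = cls()
    crawler.signals.connect(temp.customize_close_spider, signal=signals.spider_closed)
    return temp

def customize_close_spider(self, **kwargs):
    # The spider_closed signal delivers the spider and the close reason.
    reason = kwargs.get("reason")
    spider = kwargs.get("spider")
    if reason == "finished":
        pass  # some action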

I can't pass the item to either from_crawler or customize_close_spider, but I need it there in order to read the country attribute.

Maybe there is another way, for example sending a signal to another method that I can call from the tracking method?

The spider_closed handler is executed only once, at the end of the crawl, so no individual item is passed to it. If you need to do something for every item, use the process_item method, which is executed once per item.
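One way to combine the two is to remember what you need from each item in process_item and read it back when the spider closes. The following is only a minimal sketch under stated assumptions: the class name TrackingPipeline, the attribute rows_by_country, and the helper send_report are hypothetical, and items are assumed to support dict-style access with a "country" key.

```python
from collections import defaultdict


class TrackingPipeline:
    """Hypothetical pipeline that groups tracking rows by country."""

    def open_spider(self, spider):
        # Fresh per-crawl state: country -> list of tracked rows.
        self.rows_by_country = defaultdict(list)

    def process_item(self, item, spider):
        # Called once per item: keep a copy of the row under its country.
        # Assumes the item behaves like a dict and has a "country" field.
        self.rows_by_country[item["country"]].append(dict(item))
        return item

    def close_spider(self, spider):
        # Called once when the spider closes: everything collected is available here.
        for country, rows in self.rows_by_country.items():
            self.send_report(country, rows)

    def send_report(self, country, rows):
        # Placeholder for the real delivery step (email, upload, etc.).
        print(f"{country}: {len(rows)} tracked item(s)")
```

close_spider does not receive the close reason. If the reason == "finished" check matters, the signal handler from the question can stay as it is and read the same attribute on self.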

If you need to wait until all items have been scraped, you can write every item to a file (see the Scrapy documentation) and read that file back in the spider_closed handler.
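The file-based variant could look roughly like this. It is a sketch rather than the answerer's exact method: the class name FileTrackingPipeline, the temporary JSON Lines file, and the rows_by_country attribute are all assumptions, and items are assumed to be dict-like and JSON-serializable.

```python
import json
import os
import tempfile
from collections import defaultdict


class FileTrackingPipeline:
    """Hypothetical pipeline that spools items to disk and reads them back at close."""

    def open_spider(self, spider):
        # Create a temporary JSON Lines file for this crawl.
        fd, self.path = tempfile.mkstemp(suffix=".jsonl")
        self.file = os.fdopen(fd, "w", encoding="utf-8")

    def process_item(self, item, spider):
        # One JSON object per line; assumes the item is dict-like.
        self.file.write(json.dumps(dict(item)) + "\n")
        return item

    def close_spider(self, spider):
        # Finish writing, then read everything back grouped by country.
        self.file.close()
        self.rows_by_country = defaultdict(list)
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                row = json.loads(line)
                self.rows_by_country[row["country"]].append(row)
        os.remove(self.path)
        # self.rows_by_country is now ready to be sent out per country.
```

Compared with keeping everything in memory, this keeps the pipeline's footprint small during a long crawl, at the cost of one extra pass over the file at the end.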
