
How to access Scrapy stats from a pipeline

From the Scrapy API I know that a crawler has a stats attribute, but how can I access it from a custom pipeline?

class MyPipeline(object):

    def __init__(self): 
        self.stats = ???

A pipeline can receive the Crawler object the same way an extension does: through the from_crawler(cls, crawler) class method. If the pipeline defines it, Scrapy calls it to create the pipeline instance, and the crawler it passes in exposes the stats attribute.

In short, you should do something like this:

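class MyPipeline(object):

    def __init__(self, stats):
        # Keep a reference to the crawler's stats collector
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this to build the pipeline and passes in the Crawler
        return cls(crawler.stats)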

http://scrapy.readthedocs.org/en/latest/topics/stats.html#topics-stats
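To see the from_crawler wiring work without starting a crawl, here is a minimal sketch. FakeStats and fake_crawler are hypothetical stand-ins written for this example (they are not part of Scrapy) and only mimic the inc_value and get_value methods of a stats collector; the stat key my_pipeline/items_seen is also made up.

```python
from types import SimpleNamespace


class FakeStats:
    """Hypothetical stand-in that mimics inc_value/get_value of a stats collector."""

    def __init__(self):
        self._values = {}

    def inc_value(self, key, count=1, start=0):
        # Start from `start` if the key is new, then add `count`
        self._values[key] = self._values.get(key, start) + count

    def get_value(self, key, default=None):
        return self._values.get(key, default)


class MyPipeline:
    def __init__(self, stats):
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # In a real project Scrapy calls this and passes the running Crawler
        return cls(crawler.stats)

    def process_item(self, item, spider):
        # "my_pipeline/items_seen" is an illustrative key, not a built-in stat
        self.stats.inc_value("my_pipeline/items_seen")
        return item


# Stand-in for the Crawler object: only the `stats` attribute is needed here
fake_crawler = SimpleNamespace(stats=FakeStats())
pipeline = MyPipeline.from_crawler(fake_crawler)

pipeline.process_item({"title": "a"}, spider=None)
pipeline.process_item({"title": "b"}, spider=None)
items_seen = pipeline.stats.get_value("my_pipeline/items_seen")
```

In a real project nothing calls from_crawler by hand: once the pipeline is listed in the ITEM_PIPELINES setting, Scrapy builds it through that class method.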

The stats are also available through spider.crawler, for example (v1.1.0):

class ObjPipeline(object):
    def process_item(self, item, spider):
        spider.crawler.stats.inc_value('scraped_items')
        ...
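A slightly fuller sketch of the same idea, assuming the counter should be read back when the spider closes. DictStats and fake_spider are hypothetical stand-ins for this example only; in a real crawl, spider.crawler.stats is the stats collector Scrapy provides.

```python
from types import SimpleNamespace


class DictStats:
    """Hypothetical stand-in that mimics inc_value/get_value of a stats collector."""

    def __init__(self):
        self._values = {}

    def inc_value(self, key, count=1, start=0):
        self._values[key] = self._values.get(key, start) + count

    def get_value(self, key, default=None):
        return self._values.get(key, default)


class ObjPipeline:
    def process_item(self, item, spider):
        # Reach the stats collector through the spider, no from_crawler needed
        spider.crawler.stats.inc_value("scraped_items")
        # Return the item so later pipelines still receive it
        return item

    def close_spider(self, spider):
        # Read the counter back once the spider is done
        self.final_count = spider.crawler.stats.get_value("scraped_items", 0)


# Stand-in spider exposing only the `crawler.stats` attribute chain
fake_spider = SimpleNamespace(crawler=SimpleNamespace(stats=DictStats()))

pipeline = ObjPipeline()
for item in ({"id": 1}, {"id": 2}, {"id": 3}):
    pipeline.process_item(item, fake_spider)
pipeline.close_spider(fake_spider)
final_count = pipeline.final_count
```

This route avoids a custom constructor, at the cost of the pipeline depending on the spider object it is handed in each call.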
