
Scrapy: Can't override __init__ function

I have created a spider which inherits from CrawlSpider.

I need to use the __init__ function, but I keep getting an error.

Code:

class mySpider(CrawlSpider):

    def __init__(self):
        super(mySpider, self).__init__()
        .....

This is the error I'm getting: KeyError: Spider not found: mySpider

Without the __init__ function, everything works.

You need to put it like this:

def __init__(self, *a, **kw):
    super(MySpider, self).__init__(*a, **kw)
    # your code here
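
Forwarding *a, **kw matters because Scrapy constructs the spider with its own arguments (for example, anything passed on the command line with -a), and CrawlSpider's base __init__ also compiles the crawl rules. As a minimal sketch, assuming you want to accept a custom argument while still forwarding everything else (the category parameter is purely illustrative):

from scrapy.spiders import CrawlSpider

class MySpider(CrawlSpider):
    name = "company"

    def __init__(self, category=None, *a, **kw):
        # Forward whatever Scrapy passes in so the base class can finish
        # its own setup (CrawlSpider compiles its rules there).
        super(MySpider, self).__init__(*a, **kw)
        # Illustrative extra argument, e.g. scrapy crawl company -a category=books
        self.category = category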

Working example:

from scrapy.spiders import CrawlSpider
from scrapy import signals
from scrapy.xlib.pydispatch import dispatcher  # old-style dispatcher used by this example

class MySpider(CrawlSpider):
    name = "company"
    allowed_domains = ["site.com"]
    start_urls = ["http://www.site.com"]

    def __init__(self, *a, **kw):
        super(MySpider, self).__init__(*a, **kw)
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def spider_closed(self, spider):
        pass  # called once when the crawl finishes

Here __init__ was used to register Scrapy signals on the spider; in this example I needed the handler in the spider itself rather than in a pipeline, where it usually lives.
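
If you are on a newer Scrapy release, the dispatcher import used above is deprecated; the usual way to connect the same signal is the from_crawler classmethod. A minimal sketch of that alternative, keeping the same spider_closed handler:

from scrapy import signals
from scrapy.spiders import CrawlSpider

class MySpider(CrawlSpider):
    name = "company"
    allowed_domains = ["site.com"]
    start_urls = ["http://www.site.com"]

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        # Build the spider as usual, then hook the signal up on the crawler.
        spider = super(MySpider, cls).from_crawler(crawler, *args, **kwargs)
        crawler.signals.connect(spider.spider_closed, signal=signals.spider_closed)
        return spider

    def spider_closed(self, spider):
        # Runs once when the crawl finishes.
        self.logger.info("Spider closed: %s", spider.name)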
