
How does Scrapy find Spider class by its name?

Say I have this spider:

class SomeSpider(Spider):
    name = 'spname'

Then I can crawl my spider by creating a new instance of SomeSpider and passing it to the crawler, for example:

spider = SomeSpider()
crawler = Crawler(settings)
crawler.configure()
crawler.crawl(spider)
....

Can I do the same thing using just the spider name, i.e. 'spname'?

crawler.crawl('spname') ## I give just the spider name here

How can I create the spider dynamically? I guess the Scrapy manager does it internally, since this works fine:

scrapy crawl spname

One solution is to parse my spiders folder, get all the spider classes, and filter them by their name attribute, but this looks like a far-fetched solution!

Thank you in advance for your help.

Please take a look at the source code:

# scrapy/commands/crawl.py

class Command(ScrapyCommand):

    def run(self, args, opts):
        ...

# scrapy/spidermanager.py

class SpiderManager(object):

    def _load_spiders(self, module):
        ...

    def create(self, spider_name, **spider_kwargs):
        ...

# scrapy/utils/spider.py

def iter_spider_classes(module):
    """Return an iterator over all spider classes defined in the given module
    that can be instantiated (ie. which have name)
    """
    ...
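
Putting those pieces together, here is a rough sketch of resolving a spider by its name yourself. It assumes SpiderManager exposes a from_settings constructor alongside the create method excerpted above (old Scrapy 0.x API); treat the exact calls as illustrative:

from scrapy.spidermanager import SpiderManager
from scrapy.utils.project import get_project_settings

settings = get_project_settings()                # picks up SPIDER_MODULES from your settings.py
manager = SpiderManager.from_settings(settings)  # loads the spider classes from those modules
spider = manager.create('spname')                # instantiates the class whose name attribute == 'spname'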

Inspired by @kev's answer, here is a function that inspects the spider classes:

from scrapy.utils.misc import walk_modules
from scrapy.utils.spider import iter_spider_classes

def _load_spiders(module='spiders.SomeSpider'):
    # Map each spider's name attribute to its class
    spiders = {}
    for mod in walk_modules(module):
        for spcls in iter_spider_classes(mod):
            spiders[spcls.name] = spcls
    return spiders

Then you can look up and instantiate a spider by name:

spiders = _load_spiders()
somespider = spiders['spname']()
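
From there you can hand the instance to the crawler exactly as in the question (a sketch assuming the same old-style Crawler API and your project settings):

from scrapy.crawler import Crawler
from scrapy.utils.project import get_project_settings

settings = get_project_settings()
crawler = Crawler(settings)
crawler.configure()
crawler.crawl(somespider)   # somespider was looked up by the name 'spname' above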
