How does Scrapy find Spider class by its name?
Say I have this spider:

    class SomeSpider(Spider):
        name = 'spname'
Then I can crawl it by creating a new instance of SomeSpider and passing it to a crawler, like this:
    spider = SomeSpider()
    crawler = Crawler(settings)
    crawler.configure()
    crawler.crawl(spider)
    ...
Can I do the same thing using only the spider's name, i.e. 'spname'?

    crawler.crawl('spname')  # I give just the spider name here
How can the Spider be created dynamically? I guess the spider manager does this internally, since this works fine:

    scrapy crawl spname
One solution would be to walk my spiders folder, collect all the Spider classes, and filter them by their name attribute, but that seems like a far-fetched solution!

Thanks in advance for your help.
Take a look at the source code:
    # scrapy/commands/crawl.py
    class Command(ScrapyCommand):
        def run(self, args, opts):
            ...

    # scrapy/spidermanager.py
    class SpiderManager(object):
        def _load_spiders(self, module):
            ...
        def create(self, spider_name, **spider_kwargs):
            ...

    # scrapy/utils/spider.py
    def iter_spider_classes(module):
        """Return an iterator over all spider classes defined in the given module
        that can be instantiated (ie. which have name)
        """
        ...
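Putting these excerpts together, the lookup mechanism can be sketched in plain Python. This is an illustration of the pattern (scan a module for classes that subclass Spider and define a name, index them by name, then instantiate by name), not Scrapy's actual code:

```python
import inspect
import sys

class Spider:
    name = None

class SomeSpider(Spider):
    name = 'spname'

def iter_spider_classes(module):
    # Mimics scrapy.utils.spider.iter_spider_classes: yield the classes
    # defined in the module that subclass Spider and have a name attribute.
    for obj in list(vars(module).values()):
        if (inspect.isclass(obj) and issubclass(obj, Spider)
                and obj is not Spider and getattr(obj, 'name', None)):
            yield obj

# Build a name -> class registry from this module, then create by name,
# which is essentially what `scrapy crawl spname` does internally.
this_module = sys.modules[__name__]
spiders = {cls.name: cls for cls in iter_spider_classes(this_module)}
spider = spiders['spname']()
```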
Inspired by @kev's answer, here is a function that collects Spider classes:
    from scrapy.utils.misc import walk_modules
    from scrapy.utils.spider import iter_spider_classes

    def _load_spiders(module_path='spiders.SomeSpider'):
        # Walk the module path and index every spider class by its name.
        spiders = {}
        for module in walk_modules(module_path):
            for spcls in iter_spider_classes(module):
                spiders[spcls.name] = spcls
        return spiders

Then you can instantiate one by its name:

    spiders = _load_spiders()
    somespider = spiders['spname']()
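The `create(spider_name, **spider_kwargs)` signature in the SpiderManager excerpt above also forwards keyword arguments to the spider's constructor. A self-contained illustration of that pattern (plain Python, not Scrapy itself; the `start_url` parameter is made up for the example):

```python
class SomeSpider:
    name = 'spname'

    def __init__(self, start_url=None):
        # Hypothetical constructor argument, just to show kwargs forwarding.
        self.start_url = start_url

# Registry mapping each spider's `name` attribute to its class.
_spiders = {SomeSpider.name: SomeSpider}

def create(spider_name, **spider_kwargs):
    # Look up the class registered under this name and instantiate it,
    # passing any extra keyword arguments through to the constructor.
    try:
        spcls = _spiders[spider_name]
    except KeyError:
        raise KeyError('Spider not found: %s' % spider_name)
    return spcls(**spider_kwargs)

spider = create('spname', start_url='http://example.com')
```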