Scrapy Infinite loop with CrawlerProcess
I'm currently running Scrapy v2.5 and I want to run an infinite loop. My code:
```python
class main():
    def bucle(self, array_spider, process):
        mongo = mongodb(setting)
        for spider_name in array_spider:
            process.crawl(spider_name, params={"mongo": mongo, "spider_name": spider_name})
        process.start()
        process.stop()
        mongo.close_mongo()

if __name__ == "__main__":
    setting = get_project_settings()
    while True:
        process = CrawlerProcess(setting)
        array_spider = process.spider_loader.list()
        class_main = main()
        class_main.bucle(array_spider, process)
```
But this produces the following error:
```
Traceback (most recent call last):
  File "run_scrapy.py", line 92, in <module>
    process.start()
  File "/usr/local/lib/python3.8/dist-packages/scrapy/crawler.py", line 327, in start
    reactor.run(installSignalHandlers=False)  # blocking call
  File "/usr/local/lib/python3.8/dist-packages/twisted/internet/base.py", line 1422, in run
    self.startRunning(installSignalHandlers=installSignalHandlers)
  File "/usr/local/lib/python3.8/dist-packages/twisted/internet/base.py", line 1404, in startRunning
    ReactorBase.startRunning(cast(ReactorBase, self))
  File "/usr/local/lib/python3.8/dist-packages/twisted/internet/base.py", line 843, in startRunning
    raise error.ReactorNotRestartable()
twisted.internet.error.ReactorNotRestartable
```
Can anyone help me?
AFAIK there is no easy way to restart a spider, but there is an alternative: a spider that never closes. For that you can make use of the spider_idle signal.
According to the documentation:
> Sent when a spider has gone idle, which means the spider has no further:
> * requests waiting to be downloaded
> * requests scheduled
> * items being processed in the item pipeline
You can also find an example of using Signals in the official documentation.