@Sjaak Trekhaak has a 'hack' in "How do I stop all spiders and the engine immediately after a condition in a pipeline is met?" that can potentially stop the spiders by setting a flag in the pipeline and then raising CloseSpider in the parse method. However, I have the following code in my pipeline (where pdate and lastseen are well-defined datetime objects):
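    class StopSpiderPipeline(object):
        def process_item(self, item, spider):
            # Flag the spider once an already-seen date turns up
            if pdate < lastseen:
                spider.close_down = True
            # Pipelines must return the item (or raise DropItem)
            return item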
and in the spider:
    def parse_item(self, response):
        if self.close_down:
            raise CloseSpider(reason='Already scraped')
I get the error exceptions.AttributeError: 'SyncSpider' object has no attribute 'close_down'. Where did I go wrong? This question was originally asked by @anicake but never got an answer. Thanks.
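Has your spider's close_down attribute actually been created? It looks like it hasn't. The pipeline only sets it when pdate < lastseen, so parse_item can run before the attribute exists. Try changing your check to if "close_down" in self.__dict__: or adding self.close_down = False in your spider's __init__() method.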
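The second suggestion could look roughly like the sketch below. It is only an illustration under some assumptions: the spider name "sync", the lastseen constructor argument on the pipeline, and the item["pdate"] field are hypothetical stand-ins for the asker's real code, and the fallback classes exist only so the snippet runs where Scrapy is not installed.

```python
from datetime import datetime

try:
    from scrapy import Spider
    from scrapy.exceptions import CloseSpider
except ImportError:
    # Minimal stand-ins (assumption) so the sketch runs without Scrapy
    class Spider(object):
        def __init__(self, name=None, **kwargs):
            pass

    class CloseSpider(Exception):
        def __init__(self, reason="cancelled"):
            super().__init__()
            self.reason = reason


class SyncSpider(Spider):
    name = "sync"  # hypothetical spider name

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Create the flag up front so parse_item can always read it
        self.close_down = False

    def parse_item(self, response):
        if self.close_down:
            raise CloseSpider(reason="Already scraped")
        return []  # real parsing logic would go here


class StopSpiderPipeline(object):
    def __init__(self, lastseen):
        # Hypothetical: the cutoff is passed in rather than read from a global
        self.lastseen = lastseen

    def process_item(self, item, spider):
        # Flip the flag once an item older than the cutoff shows up
        if item["pdate"] < self.lastseen:
            spider.close_down = True
        return item
```

If you would rather not touch __init__(), reading the flag with getattr(self, "close_down", False) in parse_item avoids the AttributeError in the same way.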