
Logging to specific error log file in scrapy

I am setting up logging in Scrapy by doing this:

from scrapy import log
from scrapy.spider import BaseSpider


class MySpider(BaseSpider):
    name = "myspider"

    def __init__(self, name=None, **kwargs):
        LOG_FILE = "logs/spider.log"
        log.log.defaultObserver = log.log.DefaultObserver()
        log.log.defaultObserver.start()
        log.started = False
        log.start(LOG_FILE, loglevel=log.INFO)
        super(MySpider, self).__init__(name, **kwargs)

    def parse(self, response):
        ...
        log.msg('Something went wrong!', level=log.ERROR)
        # Somehow write to a separate error log here.
        raise Exception("Something went wrong!")

        # Somehow write to a separate error log here.

Then I run the spider like this:

scrapy crawl myspider

This stores all of the log.INFO output as well as log.ERROR in spider.log.

If an error occurs, I would also like to store those details in a separate log file called spider_errors.log. That would make it easier to find the errors that occurred, rather than scanning through the entire spider.log file (which could be huge).

Is there a way to do this?

EDIT:

Trying with PythonLoggingObserver:

    def __init__(self, name=None, **kwargs):
        LOG_FILE = 'logs/spider.log'
        ERR_FILE = 'logs/spider_error.log'

        observer = log.log.PythonLoggingObserver()
        observer.start()

        log.started = False
        log.start(LOG_FILE, loglevel=log.INFO)
        log.start(ERR_FILE, loglevel=log.ERROR)

But I get ERROR: No handlers could be found for logger "twisted"

Just let logging do the job. Try to use PythonLoggingObserver instead of DefaultObserver:

  • configure two loggers (one for INFO and one for ERROR messages) directly in Python, or via fileConfig, or via dictConfig (see the logging configuration docs); a minimal sketch follows after this list
  • start the observer in the spider's __init__:

     def __init__(self, name=None, **kwargs):
         # TODO: configure logging, e.g. logging.config.fileConfig("logging.conf")
         observer = log.PythonLoggingObserver()
         observer.start()
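For the first step, here is a minimal sketch of configuring the two loggers directly in Python (the file paths and format string are assumptions, not from the original post): attach two FileHandler instances to the root logger, one at INFO and one at ERROR, so the records that PythonLoggingObserver forwards from Twisted land in each file according to their level. Having handlers on the root logger should also make the "No handlers could be found for logger "twisted"" error from the question go away.

    import logging

    # Sketch only: paths and format are assumptions.
    formatter = logging.Formatter('%(asctime)s %(levelname)s: %(message)s')

    info_handler = logging.FileHandler('logs/spider.log')
    info_handler.setLevel(logging.INFO)      # receives INFO and above
    info_handler.setFormatter(formatter)

    error_handler = logging.FileHandler('logs/spider_error.log')
    error_handler.setLevel(logging.ERROR)    # receives ERROR and above only
    error_handler.setFormatter(formatter)

    root = logging.getLogger()
    root.setLevel(logging.INFO)
    root.addHandler(info_handler)
    root.addHandler(error_handler)

Run this once (for example at the top of __init__) before starting the observer.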

Let me know if you need help with configuring loggers.

EDIT:

Another option is to start two file log observers in the spider's __init__:

import logging

from scrapy import log
from scrapy.log import ScrapyFileLogObserver
from scrapy.spider import BaseSpider


class MySpider(BaseSpider):
    name = "myspider"

    def __init__(self, name=None, **kwargs):
        ScrapyFileLogObserver(open("spider.log", 'w'), level=logging.INFO).start()
        ScrapyFileLogObserver(open("spider_error.log", 'w'), level=logging.ERROR).start()

        super(MySpider, self).__init__(name, **kwargs)

    ...
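With both observers started, each observer writes only the messages at or above its own level: INFO messages should end up in spider.log only, while ERROR messages should end up in both files. A short usage sketch (the messages themselves are made up for illustration):

    def parse(self, response):
        # INFO lands in spider.log only; ERROR lands in both files,
        # since each observer records messages at or above its level.
        log.msg("Parsed %s" % response.url, level=log.INFO)
        log.msg("Something went wrong!", level=log.ERROR)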
