
Python Logging vs performance

I'm using the Python logger in one of my programs.

The program is a solver for an NP-hard problem, and therefore it iterates deeply over many runs.

My question is whether the logger can affect my program's performance, and whether there is a better way to log information while preserving performance.

Depending on your logger configuration and the amount of logs your program produces: yes, logging can become a performance bottleneck because of blocking logger operations. For example, when logging directly to an NFS file on an NFS server with slow response times. One possible approach to improve performance in such a case is switching to a log server that can buffer and possibly batch logging operations. The blocking is then limited to the communication with the log server, not to the (slow) log-file access, which is usually better from a performance point of view.
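One way to get this kind of decoupling without a separate log server is the standard library's `logging.handlers.QueueHandler` plus `QueueListener`: the solver threads only enqueue records (cheap and non-blocking), while a background thread performs the slow I/O. A minimal sketch, assuming a multi-threaded program and using a `StreamHandler` as a stand-in for the slow file handler:

```python
import logging
import logging.handlers
import queue

# Worker threads log into an in-memory queue; a QueueListener drains it
# to the slow handler (e.g. a FileHandler on NFS) in a background thread.
log_queue = queue.Queue(-1)  # unbounded queue

queue_handler = logging.handlers.QueueHandler(log_queue)
slow_handler = logging.StreamHandler()  # stand-in for the slow FileHandler

root = logging.getLogger()
root.addHandler(queue_handler)
root.setLevel(logging.INFO)

listener = logging.handlers.QueueListener(log_queue, slow_handler)
listener.start()

root.info("solver iteration finished")  # returns almost immediately

listener.stop()  # flushes remaining records on shutdown
```

With this setup the logging call in the hot path costs roughly one queue put; the actual formatting and file write happen off the solver threads.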

I have had good experience using two different log files:

  1. A server.log file for the operators, which receives only important messages, usually INFO, WARNING, ERROR, CRITICAL.
  2. A debug.log file for the developers to analyze errors. It contains up to 100 DEBUG messages from the thread, from the time before the ERROR occurred.

For the second file, I use thread-local ring buffers that are written to the file only when the program detects an error. Thus the server.log file stays small, but the developers still get enough DEBUG messages to analyze a problem. If nothing goes wrong, both files stay completely empty, and therefore do not hurt performance. Of course, the buffers cost some memory and a little CPU power, but that is acceptable.

Here is an example implementation that I use in Odoo (which is a Python application):

import logging, collections, time

class LogBuffer(logging.Handler):
    """Buffer debug messages per thread and write them out when an error (or warning) occurs"""

    def __init__(self, target_handler, threshold, max_buffered_messages, max_buffer_seconds):
        logging.Handler.__init__(self, logging.DEBUG)
        self.thread_buffers = dict()  # stores one buffer for each thread (key=thread id)
        self.target_handler = target_handler
        self.threshold = threshold
        self.max_buffered_messages = max_buffered_messages
        self.last_check_time = time.time()
        self.max_buffer_seconds = max_buffer_seconds

    def emit(self, record):
        """Do whatever it takes to actually log the specified logging record."""

        # Create a thread-local buffer if one does not exist yet
        if record.thread not in self.thread_buffers:
            thread_buffer = self.thread_buffers[record.thread] = collections.deque()
        else:
            thread_buffer = self.thread_buffers[record.thread]

        # Put the log record into the buffer
        thread_buffer.append(record)

        # If the buffer has grown too large, remove the oldest entry
        if len(thread_buffer) > self.max_buffered_messages:
            thread_buffer.popleft()

        # Produce output if the log level is high enough
        if record.levelno >= self.threshold:
            for r in thread_buffer:
                self.target_handler.emit(r)
            thread_buffer.clear()

        # Remove very old messages from all buffers once per minute
        now = time.time()
        elapsed = now - self.last_check_time
        if elapsed > 60:
            # Iterate over all buffers
            for key, buffer in list(self.thread_buffers.items()):
                # Drop records older than max_buffer_seconds
                for r in list(buffer):
                    age = now - r.created
                    if age > self.max_buffer_seconds:
                        buffer.remove(r)
                # If the buffer is now empty, remove it
                if not buffer:
                    del self.thread_buffers[key]
            self.last_check_time = now

An example of how to create/configure such a logger:

import logging
from . import logbuffer

"""
    Possible placeholders for the formatter:

    %(name)s            Name of the logger (logging channel)
    %(levelno)s         Numeric logging level for the message (DEBUG, INFO,
                        WARNING, ERROR, CRITICAL)
    %(levelname)s       Text logging level for the message ("DEBUG", "INFO",
                        "WARNING", "ERROR", "CRITICAL")
    %(pathname)s        Full pathname of the source file where the logging
                        call was issued (if available)
    %(filename)s        Filename portion of pathname
    %(module)s          Module (name portion of filename)
    %(lineno)d          Source line number where the logging call was issued
                        (if available)
    %(funcName)s        Function name
    %(created)f         Time when the LogRecord was created (time.time()
                        return value)
    %(asctime)s         Textual time when the LogRecord was created
    %(msecs)d           Millisecond portion of the creation time
    %(relativeCreated)d Time in milliseconds when the LogRecord was created,
                        relative to the time the logging module was loaded
                        (typically at application startup time)
    %(thread)d          Thread ID (if available)
    %(threadName)s      Thread name (if available)
    %(process)d         Process ID (if available)
    %(message)s         The result of record.getMessage(), computed just as
                        the record is emitted
"""

# Log levels are: CRITICAL, ERROR, WARNING, INFO, DEBUG

# Specify the output format
formatter = logging.Formatter('%(asctime)-15s %(thread)20d %(levelname)-8s %(name)s %(message)s')

# Create server.log
server_log = logging.FileHandler('../log/server.log')
server_log.setLevel(logging.INFO)
server_log.setFormatter(formatter)
logging.root.addHandler(server_log)

# Create debug.log
debug_log = logging.FileHandler('../log/debug.log')
debug_log.setFormatter(formatter)
memory_handler = logbuffer.LogBuffer(debug_log, threshold=logging.ERROR, max_buffered_messages=100, max_buffer_seconds=600)
logging.root.addHandler(memory_handler)

# Specify log levels for individual packages
logging.getLogger('odoo.addons').setLevel(logging.DEBUG)

# The default log level for all other packages
logging.root.setLevel(logging.INFO)
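For comparison, the standard library already ships a simpler variant of this buffering idea: `logging.handlers.MemoryHandler`. It buffers records in memory and flushes them all to a target handler once a record at or above `flushLevel` arrives; unlike the ring buffer above, it is not per-thread and it flushes (rather than dropping the oldest entry) when capacity is reached. A minimal sketch, using a small list-collecting handler of my own as the target for illustration:

```python
import logging
import logging.handlers

captured = []

class ListHandler(logging.Handler):
    """Illustrative target that collects log messages in a list."""
    def emit(self, record):
        captured.append(record.getMessage())

target = ListHandler(logging.DEBUG)
mem = logging.handlers.MemoryHandler(
    capacity=100, flushLevel=logging.ERROR, target=target)

log = logging.getLogger("demo")
log.setLevel(logging.DEBUG)
log.addHandler(mem)
log.propagate = False

log.debug("step 1")  # buffered, nothing reaches the target yet
log.debug("step 2")  # buffered
log.error("boom")    # flushes all three buffered records to the target
```

If you don't need the per-thread separation or the ring-buffer ageing, `MemoryHandler` may already be enough.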

Please let me know if you find this helpful. I am at a very beginner level in Python, but I have had this running successfully in Java and C++ for years.
