
Python Logging vs performance

I'm using the Python Logger in one of my programs.

The program is a solver for an NP-hard problem and therefore uses deep iterations that run many times.

My question is whether the Logger can be an issue for the performance of my program, and whether there are better ways to log information while maintaining performance.

Depending on your Logger configuration and the amount of logs your program produces, yes, logging can be a performance bottleneck, because Logger operations block. An example is logging directly to a file on an NFS share whose server has slow response times. One possible approach to improving performance in such a case is to switch to a log server that can buffer, and possibly batch, logging operations: the blocking is then limited to the communication with the log server rather than to the (slow) logfile access, which is often better from a performance perspective.
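
If you want to stay within the standard library, a similar decoupling is available through logging.handlers.QueueHandler and QueueListener (available since Python 3.2): the application threads only enqueue records, and a background thread performs the slow writes. A minimal sketch, assuming the slow target is an ordinary file (the filename is illustrative):

import logging, logging.handlers, queue

# In-memory queue that decouples application threads from the slow I/O
log_queue = queue.SimpleQueue()

# The potentially slow handler, e.g. a file on a slow (NFS) filesystem
file_handler = logging.FileHandler('server.log')

# Application threads see only this handler; emitting is a cheap enqueue
logging.root.addHandler(logging.handlers.QueueHandler(log_queue))
logging.root.setLevel(logging.INFO)

# A background thread pulls records off the queue and feeds the slow handler
listener = logging.handlers.QueueListener(log_queue, file_handler)
listener.start()

logging.info('this call only enqueues; the file write happens in the background')

listener.stop()  # at shutdown: flush remaining records and join the thread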

I have had very good experience using two different logfiles:

  1. The server.log file is for the operator and receives only important messages, usually INFO, WARNING, ERROR, and CRITICAL.
  2. The debug.log file is for the developer to analyze errors. It contains up to 100 DEBUG messages from the same thread, from the time before an ERROR occurred.

For the second file, I use thread-local ring buffers that are only written to a file when the program detects an error. Thus the server.log file remains small, but the developers get enough debug messages to analyze problems later. If no problem occurs, both files remain practically empty and thus do not harm performance. Of course, the buffers cost memory and a little CPU power, but that is acceptable.

This is an example implementation which I am using in Odoo (a Python application):

import logging, collections, time

class LogBuffer(logging.Handler):
    """Buffer debug messages per thread and write them out when an error (or warning) occurs"""

    def __init__(self, target_handler, threshold, max_buffered_messages, max_buffer_seconds):
        logging.Handler.__init__(self, logging.DEBUG)
        self.thread_buffers = dict()  # stores one buffer for each thread (key = thread id)
        self.target_handler = target_handler
        self.threshold = threshold
        self.max_buffered_messages = max_buffered_messages
        self.last_check_time = time.time()
        self.max_buffer_seconds = max_buffer_seconds

    def emit(self, record):
        """Do whatever it takes to actually log the specified logging record."""

        # Create a thread-local buffer for this thread if one does not exist yet
        thread_buffer = self.thread_buffers.setdefault(record.thread, collections.deque())

        # Put the log record into the buffer
        thread_buffer.append(record)

        # If the buffer has become too large, remove the oldest entry
        if len(thread_buffer) > self.max_buffered_messages:
            thread_buffer.popleft()

        # produce output if the log level is high enough
        if record.levelno >= self.threshold:
            for r in thread_buffer:
                self.target_handler.emit(r)
            thread_buffer.clear()

        # remove very old messages from all buffers once per minute
        now = time.time()
        elapsed = now - self.last_check_time
        if elapsed > 60:
            # Iterate over all buffers
            for key, buffer in list(self.thread_buffers.items()):
                # Iterate over the content of one buffer
                for r in list(buffer):
                    age = now - r.created
                    if age > self.max_buffer_seconds:
                        buffer.remove(r)
                # If the buffer is now empty, then remove it
                if not buffer:
                    del self.thread_buffers[key]
            self.last_check_time = now

An example of how to create/configure such a logger:

import logging
from . import logbuffer

"""
    Possible placeholders for the formatter:

    %(name)s            Name of the logger (logging channel)
    %(levelno)s         Numeric logging level for the message (DEBUG, INFO,
                        WARNING, ERROR, CRITICAL)
    %(levelname)s       Text logging level for the message ("DEBUG", "INFO",
                        "WARNING", "ERROR", "CRITICAL")
    %(pathname)s        Full pathname of the source file where the logging
                        call was issued (if available)
    %(filename)s        Filename portion of pathname
    %(module)s          Module (name portion of filename)
    %(lineno)d          Source line number where the logging call was issued
                        (if available)
    %(funcName)s        Function name
    %(created)f         Time when the LogRecord was created (time.time()
                        return value)
    %(asctime)s         Textual time when the LogRecord was created
    %(msecs)d           Millisecond portion of the creation time
    %(relativeCreated)d Time in milliseconds when the LogRecord was created,
                        relative to the time the logging module was loaded
                        (typically at application startup time)
    %(thread)d          Thread ID (if available)
    %(threadName)s      Thread name (if available)
    %(process)d         Process ID (if available)
    %(message)s         The result of record.getMessage(), computed just as
                        the record is emitted
"""

# Log levels are: CRITICAL, ERROR, WARNING, INFO, DEBUG

# Specify the output format
formatter = logging.Formatter('%(asctime)-15s %(thread)20d %(levelname)-8s %(name)s %(message)s')

# Create server.log
server_log = logging.FileHandler('../log/server.log')
server_log.setLevel(logging.INFO)
server_log.setFormatter(formatter)
logging.root.addHandler(server_log)

# Create debug.log
debug_log = logging.FileHandler('../log/debug.log')
debug_log.setFormatter(formatter)
memory_handler = logbuffer.LogBuffer(debug_log, threshold=logging.ERROR, max_buffered_messages=100, max_buffer_seconds=600)
logging.root.addHandler(memory_handler)

# Specify log levels for individual packages
logging.getLogger('odoo.addons').setLevel(logging.DEBUG)

# The default log level for all other packages
logging.root.setLevel(logging.INFO)
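
For illustration, this is how the configuration above behaves from application code; the logger name odoo.addons.my_module is a hypothetical example:

import logging

log = logging.getLogger('odoo.addons.my_module')  # hypothetical module name

log.debug('entering solver iteration')  # only buffered in memory by LogBuffer
log.info('solver started')              # written to server.log (and buffered)
log.error('solver failed')              # written to server.log; additionally, LogBuffer
                                        # flushes this thread's buffered messages to debug.log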

Please let me know if you find this helpful. I'm at a very beginner level regarding Python, but I have had the same thing running successfully in Java and C++ for years.
