
How can I prevent the inheritance of python loggers and handlers during multiprocessing based on fork?

Suppose I configured logging handlers in the main process. The main process spawns some children and, due to os.fork() (on Linux), all loggers and handlers are inherited from the main process. In the example below, 'Hello World' would be printed 100 times to the console:

import multiprocessing as mp
import logging


def do_log(no):
    # root logger logs Hello World to stderr (StreamHandler)
    # BUT I DON'T WANT THAT!
    logging.getLogger().info('Hello world {}'.format(no))


def main():
    format = '%(processName)-10s %(name)s %(levelname)-8s %(message)s'

    # This creates a StreamHandler
    logging.basicConfig(format=format, level=logging.INFO)

    n_cores = 4
    pool = mp.Pool(n_cores)
    # Log to stderr 100 times concurrently
    pool.map(do_log, range(100))
    pool.close()
    pool.join()


if __name__ == '__main__':
    main()

This will print something like:

ForkPoolWorker-1 root INFO     Hello world 0
ForkPoolWorker-3 root INFO     Hello world 14
ForkPoolWorker-3 root INFO     Hello world 15
ForkPoolWorker-3 root INFO     Hello world 16
...

However, I don't want the child processes to inherit all the logging configuration from the parent. So in the example above, do_log should not print anything to stderr, because there should be no StreamHandler.

How do I prevent inheriting the loggers and handlers without removing or deleting them in the original parent process?


EDIT: Would it be a good idea to simply remove all handlers at the initialization of the pool?

def init_logging():
    # The root logger is not in loggerDict, so include it explicitly;
    # some loggerDict entries are PlaceHolder objects without handlers.
    loggers = [logging.getLogger()] + list(logging.Logger.manager.loggerDict.values())
    for logger in loggers:
        if hasattr(logger, 'handlers'):
            logger.handlers = []

and

pool = mp.Pool(n_cores, initializer=init_logging, initargs=())

Moreover, can I also safely close() all (file) handlers during the initialization function?
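
For reference, a sketch of an initializer that detaches and also closes the inherited handlers might look like the following. This is an assumption-laden sketch, not a safety guarantee for every handler type; the idea is that after fork() the child closes only its own duplicated file descriptors, so the parent's handlers should keep working:

import logging


def init_logging():
    # Sketch: detach and close every handler the child inherited via fork().
    # The root logger is not in loggerDict, so include it explicitly.
    loggers = [logging.getLogger()] + list(logging.Logger.manager.loggerDict.values())
    for logger in loggers:
        for handler in list(getattr(logger, 'handlers', [])):
            logger.removeHandler(handler)
            handler.close()  # closes the child's copy of the fd, not the parent's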

You don't need to prevent it, you just need to reconfigure the logging hierarchy.

I think you're on the right track with the pool initializer. But instead of trying to hack around the inherited state, let the logging package do what it's designed to do: reconfigure the logging hierarchy in the worker processes.

Here's an example:

import logging
import logging.config  # dictConfig lives in logging.config
import multiprocessing as mp


def main():

    def configure_logging():
        logging_config = {
            'formatters': {
                'f': {
                    'format': '%(processName)-10s %(name)s'
                              ' %(levelname)-8s %(message)s',
                },
            },
            'handlers': {
                'h': {
                    'level': 'INFO',
                    'class': 'logging.StreamHandler',
                    'formatter': 'f',
                },
            },
            'loggers': {
                '': {
                    'handlers': ['h'],
                    'level': 'INFO',
                    'propagate': True,
                },
            },
            'version': 1,
        }

        pname = mp.current_process().name
        if pname != 'MainProcess':
            logging_config['handlers'] = {
                'h': {
                    'level': 'INFO',
                    'formatter': 'f',
                    'class': 'logging.FileHandler',
                    'filename': pname + '.log',
                },
            }

        logging.config.dictConfig(logging_config)

    configure_logging() # MainProcess
    def pool_initializer():
        configure_logging()

    n_cores = 4
    pool = mp.Pool(n_cores, initializer=pool_initializer)
    pool.map(do_log, range(100))
    pool.close()
    pool.join()

Now, the worker processes will each log to their own individual log files, and will no longer use the main process's stderr StreamHandler.

The most straightforward answer is that you should probably avoid modifying globals with multiprocessing. Note that the root logger, which you get using logging.getLogger(), is global.
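
As a quick illustration of that globality:

import logging

# getLogger() with no name always returns the same root logger object,
# so any handler configured on it is process-global state that fork() copies.
assert logging.getLogger() is logging.getLogger()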

The easiest way around this is simply creating a new logging.Logger instance for each process. You can name them after the processes, or just randomly:

import uuid

log = logging.getLogger(str(uuid.uuid4()))
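
To name the logger after the worker process instead, a minimal sketch (get_worker_logger is a hypothetical helper, not part of the logging API):

import logging
import multiprocessing as mp


def get_worker_logger():
    # One logger per worker, keyed by the process name (e.g. 'ForkPoolWorker-1').
    logger = logging.getLogger(mp.current_process().name)
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler())
        logger.setLevel(logging.INFO)
    # Don't bubble records up to the root logger's inherited handlers.
    logger.propagate = False
    return logger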

You may also want to check "How should I log while using multiprocessing in Python".

If you need to prevent the logging hierarchy from being inherited in the worker processes, simply do the logging configuration after creating the worker pool. From your example:

pool = mp.Pool(n_cores)
logging.basicConfig(format=format, level=logging.INFO)

Then, nothing will be inherited.
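
Applied to the question's example, the reordered main() would look something like this (a sketch; do_log is the function from the question, and the forked workers end up with a handler-free root logger, so their info() calls produce no output):

import logging
import multiprocessing as mp


def main():
    n_cores = 4
    # Fork the workers first, while the root logger still has no handlers.
    pool = mp.Pool(n_cores)

    # Configure logging afterwards: only the parent gets the StreamHandler.
    format = '%(processName)-10s %(name)s %(levelname)-8s %(message)s'
    logging.basicConfig(format=format, level=logging.INFO)

    pool.map(do_log, range(100))
    pool.close()
    pool.join()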

Otherwise, as you said, because of os.fork() things will get inherited/duplicated. In that case, your options are reconfiguring logging after creating the pool (see my other answer), or the other suggestions/answers here.
