
Python: How to use different logfiles for processes in multiprocessing.Pool?

I am using multiprocessing.Pool to run a number of independent processes in parallel. It is not much different from the basic example in the Python docs:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))

I would like each process to have a separate log file. I log various info from other modules in my codebase and from some third-party packages (none of them is multiprocessing-aware). So, for example, I would like this:

import logging
from multiprocessing import Pool

def f(x):
    logging.info(f"x*x={x*x}")
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, range(10)))

to write to disk:

log1.log
log2.log
log3.log
log4.log
log5.log

How do I achieve this?

You'll need to use Pool's initializer to set up and register the separate loggers immediately after the workers start up. Under the hood, the arguments to Pool(initializer) and Pool(initargs) end up being passed to Process(target) and Process(args) when the new worker processes are created...
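To illustrate just the mechanism, here's a minimal sketch (the print statement and the worker count are only for demonstration) showing that the initializer runs exactly once in each freshly started worker:

import multiprocessing as mp

def _init():
    # Runs once in every worker process, right after it starts.
    print(f"initializer ran in {mp.current_process().name}")

if __name__ == '__main__':
    with mp.Pool(2, initializer=_init) as pool:
        pool.map(abs, range(4))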

Pool workers get named in the format {start_method}PoolWorker-{number}, so e.g. SpawnPoolWorker-1 if you use spawn as the start method for new processes. The file number for the logfiles can then be extracted from the assigned worker name with mp.current_process().name.split('-')[1].

import logging
import multiprocessing as mp


def f(x):
    # 'logger' is injected into this module's globals by _init_logging()
    # in each worker process before any tasks are dispatched to it.
    logger.info(f"x*x={x*x}")
    return x*x


def _init_logging(level=logging.INFO, mode='a'):
    # Extract the worker number from the name Pool assigned to this
    # process, e.g. 'SpawnPoolWorker-1' -> '1'.
    worker_no = mp.current_process().name.split('-')[1]
    filename = f"log{worker_no}.log"
    fh = logging.FileHandler(filename, mode=mode)
    fmt = logging.Formatter(
        '%(asctime)s %(processName)-10s %(name)s %(levelname)-8s --- %(message)s'
    )
    fh.setFormatter(fmt)
    # Attach the file handler to the root logger, so records from other
    # modules and third-party packages end up in this worker's file too.
    logger = logging.getLogger()
    logger.addHandler(fh)
    logger.setLevel(level)
    # Expose the configured logger as a module-level global for f().
    globals()['logger'] = logger


if __name__ == '__main__':

    with mp.Pool(5, initializer=_init_logging, initargs=(logging.DEBUG,)) as pool:
        print(pool.map(f, range(10)))

Note that, due to the nature of multiprocessing, there's no guarantee of the exact number of files you end up with in your small example. Since multiprocessing.Pool (contrary to concurrent.futures.ProcessPoolExecutor) starts its workers as soon as you create the instance, you're bound to get the specified Pool(processes) number of files, so in your case 5. Actual thread/process scheduling by your OS might cut this number short here, though.
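If you'd rather use concurrent.futures, ProcessPoolExecutor (Python 3.7+) accepts the same initializer/initargs parameters, so the setup above carries over. A minimal sketch, assuming _init_logging and f are defined as in the example above; the worker names differ here (e.g. SpawnProcess-1 instead of SpawnPoolWorker-1), but extracting the number with .split('-')[1] still works:

import logging
from concurrent.futures import ProcessPoolExecutor

if __name__ == '__main__':
    # Workers are spawned on demand here, so with a small workload
    # fewer than max_workers logfiles may actually get created.
    with ProcessPoolExecutor(
        max_workers=5, initializer=_init_logging, initargs=(logging.DEBUG,)
    ) as pool:
        print(list(pool.map(f, range(10))))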
