Python: How to use different logfiles for processes in multiprocessing.Pool?

I am using multiprocessing.Pool to run a number of independent processes in parallel, not much different from the basic example in the Python docs:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))

I would like each process to have a separate log file. I log various info from other modules in my codebase and from some third-party packages (none of which are multiprocessing-aware). So, for example, I would like this:

import logging
from multiprocessing import Pool

def f(x):
    logging.info(f"x*x={x*x}")
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, range(10)))

to write these logfiles to disk:

log1.log
log2.log
log3.log
log4.log
log5.log

How do I achieve it?

You'll need to use Pool's initializer to set up and register the separate loggers immediately after the workers start up. Under the hood, the arguments to Pool(initializer) and Pool(initargs) end up being passed along to Process(target) and Process(args) when the worker processes are created...
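To illustrate the mechanics, here is a heavily simplified, hypothetical sketch of such a worker loop (not CPython's actual pool code, just the general shape): the initializer runs once inside the child process, before the task loop starts.

import multiprocessing as mp


def worker(inqueue, outqueue, initializer=None, initargs=()):
    if initializer is not None:
        initializer(*initargs)      # runs exactly once, inside the child process
    for func, arg in iter(inqueue.get, None):
        outqueue.put(func(arg))     # then the ordinary task loop takes over


if __name__ == '__main__':
    inq, outq = mp.SimpleQueue(), mp.SimpleQueue()
    p = mp.Process(target=worker, args=(inq, outq, print, ('worker ready',)))
    p.start()
    inq.put((abs, -3))
    inq.put(None)                   # sentinel: tell the worker to shut down
    print(outq.get())               # prints 3
    p.join()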

Pool workers get named in the format {start_method}PoolWorker-{number}, e.g. SpawnPoolWorker-1 if you use spawn as the start method for new processes. The file number for the logfiles can then be extracted from the assigned worker name with mp.current_process().name.split('-')[1].
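A quick way to inspect those names (a minimal sketch; the exact prefix depends on your platform's default start method):

import multiprocessing as mp

def report_name(_):
    # Runs inside a worker; returns that worker's assigned name.
    return mp.current_process().name

if __name__ == '__main__':
    with mp.Pool(2) as pool:
        print(sorted(set(pool.map(report_name, range(8)))))
        # e.g. ['SpawnPoolWorker-1', 'SpawnPoolWorker-2'] under spawn,
        # or ['ForkPoolWorker-1', 'ForkPoolWorker-2'] under fork

Putting it all together with the logging setup: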

import logging
import multiprocessing as mp


def f(x):
    # `logger` was placed into this worker's globals by _init_logging().
    logger.info(f"x*x={x*x}")
    return x*x


def _init_logging(level=logging.INFO, mode='a'):
    # Runs once in every worker process, right after it starts.
    worker_no = mp.current_process().name.split('-')[1]
    filename = f"log{worker_no}.log"
    fh = logging.FileHandler(filename, mode=mode)
    fmt = logging.Formatter(
        '%(asctime)s %(processName)-10s %(name)s %(levelname)-8s --- %(message)s'
    )
    fh.setFormatter(fmt)
    logger = logging.getLogger()  # root logger, so propagating records are caught too
    logger.addHandler(fh)
    logger.setLevel(level)
    globals()['logger'] = logger


if __name__ == '__main__':

    with mp.Pool(5, initializer=_init_logging, initargs=(logging.DEBUG,)) as pool:
        print(pool.map(f, range(10)))
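Note that the handler is attached to the root logger, so records from any module whose loggers propagate to root, including the third-party packages you mentioned, land in the worker's own file; those packages don't need to be multiprocessing-aware for this to work.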

Note that, due to the nature of multiprocessing, there's no guarantee about the exact number of files you end up with in your small example. Since multiprocessing.Pool (contrary to concurrent.futures.ProcessPoolExecutor) starts its workers as soon as you create the instance, you're bound to get the specified Pool(processes) number of files, so in your case 5. Actual thread/process scheduling by your OS might cut this number short here, though.
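For comparison, concurrent.futures.ProcessPoolExecutor has supported initializer and initargs since Python 3.7, so the same per-worker setup carries over. A minimal sketch, assuming its worker processes are named like SpawnProcess-1 so the same split('-') trick still yields a file number:

import logging
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor


def f(x):
    logging.getLogger().info(f"x*x={x*x}")
    return x * x


def _init_logging(level=logging.INFO):
    # Executor workers are plain Process objects (named e.g. 'SpawnProcess-1'),
    # so the numeric suffix still provides a per-worker file number.
    worker_no = mp.current_process().name.split('-')[1]
    fh = logging.FileHandler(f"log{worker_no}.log")
    fh.setFormatter(logging.Formatter(
        '%(asctime)s %(processName)-10s %(levelname)-8s --- %(message)s'
    ))
    root = logging.getLogger()
    root.addHandler(fh)
    root.setLevel(level)


if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=5, initializer=_init_logging,
                             initargs=(logging.DEBUG,)) as executor:
        print(list(executor.map(f, range(10))))

Because the executor spawns workers on demand, you may end up with fewer than max_workers logfiles if the tasks finish quickly.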
