
python multiprocessing pool and logging

My application uses multiprocessing.pool to parallelize calculations. Now I would like to add a logging feature. The code (unfortunately) needs to run on Windows. I found a related post on Stack Overflow, but its solution is not working. I suppose the multiprocessing_logging package doesn't support pools.

Here is my code:

import datetime
import logging
import multiprocessing
import os
from multiprocessing import Pool

from multiprocessing_logging import install_mp_handler

logger = logging.getLogger(__name__)

def main(): # main function
    filename = "XXX" + datetime.datetime.now().strftime('%Y-%m-%d-%H.%M.%S') + ".log"

    log_file = os.path.abspath(os.path.join('logs', filename))
    multiprocessing.freeze_support() # needed for frozen Windows executables

    logging.basicConfig(filename=log_file,
                        filemode='a',
                        format='%(asctime)s:%(msecs)d (%(processName)s) %(levelname)s %(name)s \t %(message)s',
                        datefmt='%H:%M:%S',
                        level=logging.DEBUG)

    logger.info("Start application")

def run(): # main execution
    logger.info("Generate outputs for every metric")
    num_cores = multiprocessing.cpu_count()
    logger.info("Output generation executes on " + str(num_cores) + " cores")

    # metrics_list, _generate_outputs and _create_report are defined elsewhere
    pool = Pool(num_cores, initializer=install_mp_handler)
    processed_metrics = pool.map(_generate_outputs, metrics_list)
    pool.close()
    pool.join()
    map(_create_report, processed_metrics)

The implementations of the helper functions _generate_outputs and _create_report are irrelevant to the problem. When I execute the code, the logs generated in the main process are stored correctly, but the logs from the child processes are not.

[EDIT]
I changed my code according to the comments. Now, my code looks like this:

    num_cores = multiprocessing.cpu_count()
    logger.info("Output generation executes on " + str(num_cores) + " cores")
    install_mp_handler()
    pool = Pool(num_cores, initializer=install_mp_handler)
    processed_metrics = pool.map(_generate_outputs, metrics_list)
    pool.close()
    pool.join()
    map(_create_report, processed_metrics)

But still, logs from the child processes are not captured. After the program terminates I see this error:

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\multiprocessing_logging.py", line 64, in _receive
    record = self.queue.get(timeout=0.2)
  File "C:\Python27\lib\multiprocessing\queues.py", line 131, in get
    if not self._poll(timeout):
IOError: [Errno 109] The pipe has been ended
Exception in thread mp-handler-0:
Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 801, in __bootstrap_inner
    self.run()
  File "C:\Python27\lib\threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\Python27\lib\site-packages\multiprocessing_logging.py", line 62, in _receive
    while not (self._is_closed and self.queue.empty()):
  File "C:\Python27\lib\multiprocessing\queues.py", line 146, in empty
    return not self._poll()
IOError: [Errno 109] The pipe has been ended

The key requirement is that the program needs to work on Windows.

You need to call install_mp_handler() before the Pool() instantiation.

...
install_mp_handler()
pool = Pool(num_cores, initializer=install_mp_handler)
...
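
The ordering matters because on Windows multiprocessing spawns fresh interpreter processes, so the children never inherit the logging configuration done in main(); the handler has to be wrapped in the parent before the workers start, and the initializer re-installs it in each worker. A minimal sketch that makes the underlying issue visible (the _count_root_handlers helper is hypothetical, purely for illustration):

import logging
import multiprocessing

def _count_root_handlers(_):
    # In a spawned child this module is re-imported, so any setup done
    # under `if __name__ == '__main__'` in the parent never ran here.
    return len(logging.getLogger().handlers)

if __name__ == '__main__':
    logging.basicConfig(filename='parent.log', level=logging.DEBUG)
    pool = multiprocessing.Pool(2)
    # On Windows this typically prints [0, 0]: the children have no
    # handlers, which is why their records never reach the log file.
    print(pool.map(_count_root_handlers, range(2)))
    pool.close()
    pool.join()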

In the end it all comes down to the log records being transferred over a queue to a centralized log handler; take a look at multiprocessing_logging.py for a clear understanding of the technique.
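
As a rough sketch of that technique (not the package's actual code): Python 3's standard library provides logging.handlers.QueueHandler and QueueListener, which implement the same pattern; every worker pushes records onto a shared queue, and a single listener in the parent writes them through one centralized handler. The function names and the 'app.log' filename below are placeholders:

import logging
import logging.handlers
import multiprocessing

def _worker_init(queue):
    # Route every record produced in this child into the shared queue
    # instead of trying to write to the log file directly.
    root = logging.getLogger()
    root.handlers = [logging.handlers.QueueHandler(queue)]
    root.setLevel(logging.DEBUG)

def _work(item):
    logging.getLogger(__name__).info("processing %s", item)
    return item * 2

if __name__ == '__main__':
    queue = multiprocessing.Queue(-1)
    # One centralized handler in the parent drains the queue on a
    # background thread and writes all records to a single file.
    file_handler = logging.FileHandler('app.log')
    file_handler.setFormatter(logging.Formatter(
        '%(asctime)s (%(processName)s) %(levelname)s %(message)s'))
    listener = logging.handlers.QueueListener(queue, file_handler)
    listener.start()

    pool = multiprocessing.Pool(2, initializer=_worker_init, initargs=(queue,))
    results = pool.map(_work, range(10))
    pool.close()
    pool.join()
    listener.stop()  # flush remaining records before exit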
