
Python Multiprocessing returning results with Logging and running frozen on Windows

I need some help implementing logging with multiprocessing while running the application frozen on Windows. There are dozens of topics on this subject and I have spent a lot of time reviewing and testing them. I have also extensively reviewed the documentation, but I cannot figure out how to implement this in my code.

I have created a minimal example which runs fine on Linux but crashes on Windows (even when not frozen). The example I created is just one of many iterations I have put my code through.

You can find the minimal example on GitHub. Any assistance in getting this example working would be greatly appreciated.

Thank you.

Marc.

The basics

On Linux, a child process is created by the fork method by default. That means the child process inherits almost everything from the parent process.

On Windows, the child process is created by the spawn method. That means a child process is started almost from scratch: it re-imports and re-executes any code that is outside of the guard clause if __name__ == '__main__'.
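A minimal sketch of what that guard means in practice (the module and worker function below are illustrative, not part of the original example): under spawn, everything at module level runs again in every child process, so process creation must sit behind the guard or the children would recurse.

#spawn_guard_demo.py (hypothetical)
import multiprocessing as mp

def work(n):
    # any module-level code above this point runs again in each spawned child
    print(f"worker got {n}")

if __name__ == '__main__':
    # only the parent process reaches this block when using spawn
    p = mp.Process(target=work, args=(42,))
    p.start()
    p.join()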

Why it worked or failed

On Linux, because the logger object is inherited, your program will start logging. But it is far from perfect, since you log directly to the file. Sooner or later, log lines will overlap or an IO error on the file will happen due to a race condition between processes.

On Windows, since you didn't pass the logger object to the child process, and the child re-imports your pymp_global module, logger is a None object. So when you try to log with a None object, it crashes for sure.

The solution

Logging with multiprocessing is not an easy task. For it to work on Windows, you must pass a logger object to the child processes and/or log with a QueueHandler. Another similar inter-process communication solution is to use a SocketHandler.

The idea is that only one thread or process does the logging. The other processes just send it their log records. This prevents the race condition and ensures the log is written out once the process doing the logging gets time to do its job.
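A minimal sketch of that idea with the standard library only (the module name, worker function and handler choices here are assumptions, not taken from the original example): each child attaches a QueueHandler and pushes records onto a queue, and a single QueueListener in the main process is the only thing that writes to the file.

#queue_logging_sketch.py (hypothetical)
import logging
import logging.handlers
import multiprocessing as mp

def worker(queue):
    # each child replaces its handlers with a QueueHandler and only sends records
    root = logging.getLogger()
    root.handlers[:] = [logging.handlers.QueueHandler(queue)]
    root.setLevel(logging.INFO)
    logging.getLogger(__name__).info("hello from %s", mp.current_process().name)

if __name__ == '__main__':
    queue = mp.Queue()
    file_handler = logging.FileHandler("app.log")
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(processName)s %(levelname)s %(message)s"))
    # only this listener, running in the main process, ever touches the file
    listener = logging.handlers.QueueListener(queue, file_handler)
    listener.start()
    procs = [mp.Process(target=worker, args=(queue,)) for _ in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    listener.stop()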

So how do you implement it? I have encountered this logging problem before and have already written the code. You can just use it with the logger-tt package.

#pymp.py
from logging import getLogger
from logger_tt import setup_logging

# configure logging once in the main module, before any child process is started
setup_logging(use_multiprocessing=True)
logger = getLogger(__name__)
# other code below

For other modules

#pymp_common.py
from logging import getLogger

logger = getLogger(__name__)
# other code below

This saves you from manually writing all the logging configuration code everywhere. You may consider changing the log_config file to suit your needs.
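A minimal sketch of how this might be wired up with worker processes (the job function and module layout are assumptions, not from the original repository): with setup_logging(use_multiprocessing=True) called in pymp.py, the children just use the normal logging API and logger-tt is expected to forward their records to the main process.

#pymp_main_sketch.py (hypothetical)
from logging import getLogger
from multiprocessing import Process

from pymp import logger  # importing pymp runs setup_logging(use_multiprocessing=True)

def job(i):
    # children log as usual; logger-tt is meant to route these records
    # back to the main process for writing
    getLogger(__name__).info("job %s done", i)

if __name__ == '__main__':
    workers = [Process(target=job, args=(i,)) for i in range(3)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()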
