
Python Multiprocessing returning results with Logging and running frozen on Windows

I need some help with implementing logging while multiprocessing and running the application frozen under Windows. There are dozens of topics on this subject and I have spent a lot of time reviewing and testing them. I have also extensively reviewed the documentation, but I cannot figure out how to implement this in my code.

I have created a minimal example which runs fine on Linux, but crashes on Windows (even when not frozen). The example is just one of many iterations I have put my code through.

You can find the minimal example on GitHub. Any assistance to get this example working would be greatly appreciated.

Thank you.

Marc.

The basics

On Linux, a child process is created with the fork method by default. That means the child process inherits almost everything from the parent process, including the already-configured logger object.

On Windows, the child process is created with the spawn method. That means a child process starts almost from scratch: it re-imports and re-executes any code that is outside of the guard clause if __name__ == '__main__':.
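To make the difference concrete, here is a minimal sketch (the file and function names are mine, not from your project):

# guard_demo.py - illustrates what spawn re-executes
import multiprocessing

print('module level: runs in the parent AND again in every spawned child')

def work():
    print('child working')

if __name__ == '__main__':
    # a spawned child imports this file as __mp_main__, so it never
    # re-enters this block and cannot recursively spawn more children
    multiprocessing.set_start_method('spawn')  # the default on Windows
    p = multiprocessing.Process(target=work)
    p.start()
    p.join()

With fork, the module-level print appears once; with spawn (forced here, or the default on Windows), it appears once per process.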

Why it worked or failed

On Linux, because the logger object is inherited, your program starts logging right away. But this is far from perfect, since every process logs directly to the same file. Sooner or later, log lines will overlap or file I/O errors will occur because of the race condition between processes.

On Windows, since the logger object is not passed to the child process, and the child re-imports your pymp_global module, logger is a None object there. So when you try logging with a None object, it crashes for sure.
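I have not inspected your repository line by line, so take the following as a hypothetical reconstruction of that failure mode rather than your actual code:

# pymp_global.py - hypothetical reconstruction
logger = None  # meant to be replaced once logging is configured

# pymp.py (main process)
import logging
import pymp_global
pymp_global.logger = logging.getLogger('app')  # inherited on Linux via fork

# On Windows, a spawned child re-imports pymp_global from scratch, so
# pymp_global.logger is None again there, and a call such as
# pymp_global.logger.info('...') raises AttributeError in the child.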

The solution

Logging with multiprocessing is not an easy task. For it to work on Windows, you must pass a logger object to the child processes and/or route log records between processes with a QueueHandler. Another similar solution for inter-process communication is to use a SocketHandler.

The idea is that only one thread or process does the actual writing. The other processes just send their log records to it. This prevents the race condition and ensures each record is written out once the writing process gets its turn.
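For reference, here is a minimal stdlib-only sketch of that pattern with QueueHandler and QueueListener (the worker function, log file name, and format string are illustrative choices, not requirements):

# queue_logging_demo.py
import logging
import logging.handlers
import multiprocessing

def worker(queue, n):
    # children only enqueue records; they never touch the log file
    root = logging.getLogger()
    root.addHandler(logging.handlers.QueueHandler(queue))
    root.setLevel(logging.INFO)
    logging.info('message from worker %d', n)

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    # the listener in the main process is the only writer
    file_handler = logging.FileHandler('app.log')
    file_handler.setFormatter(
        logging.Formatter('%(asctime)s %(processName)s %(message)s'))
    listener = logging.handlers.QueueListener(queue, file_handler)
    listener.start()

    procs = [multiprocessing.Process(target=worker, args=(queue, i))
             for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    listener.stop()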

So how to implement it?
I have encountered this logging problem before and have already written the code for it.
You can just use the logger-tt package.

# pymp.py
from logging import getLogger
from logger_tt import setup_logging

setup_logging(use_multiprocessing=True)
logger = getLogger(__name__)
# other code below

For other modules

# pymp_common.py
from logging import getLogger

logger = getLogger(__name__)
# other code below

This saves you from writing all the logging config code everywhere manually. You may consider changing the log_config file to suit your needs.
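As a rough end-to-end sketch of how the pieces fit together (the worker function and process setup below are illustrative; only setup_logging(use_multiprocessing=True) and getLogger come from the configuration shown above):

# pymp.py - illustrative sketch, assuming logger-tt ships child-process
# records back to the main process when use_multiprocessing=True
from logging import getLogger
from multiprocessing import Process
from logger_tt import setup_logging

setup_logging(use_multiprocessing=True)
logger = getLogger(__name__)

def worker(n):
    # children log through the ordinary logging API
    getLogger(__name__).info('worker %s started', n)

if __name__ == '__main__':
    procs = [Process(target=worker, args=(i,)) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    logger.info('all workers finished')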
