We are trying to move our Python 2.7.10 codebase from Windows to Linux. We recently discovered that the multiprocessing library in Python 2.7 behaves differently on Windows and Linux. We have found many articles like this one describing the problem; however, we have been unable to find a solution online for Python 2.7. There is a fix for this issue in Python 3.4, but we are unable to upgrade to Python 3.4. Is there any way to use multiprocessing in Python 2.7 on Linux without the child and parent sharing memory? We would also welcome guidance on modifying the forking.py code in Python 2.7 so that the child and parent processes don't share memory through copy-on-write. Thanks!
A possible solution is to use loky, a library which provides an implementation of Process with fork-exec in Python 2.7. The fork-exec start method behaves similarly to spawn, with a fresh interpreter in the newly spawned process. The library is mainly designed to provide a concurrent.futures API, but you can use mp = loky.backend.get_context() to get the same API as multiprocessing.
from loky.backend import get_context
import multiprocessing as mp


def child_without_os():
    print("Hello from {}".format(os.getpid()))


def child_with_os():
    import os
    print("Hello from {}".format(os.getpid()))


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser('Test loky backend')
    parser.add_argument('--use-fork', action="store_true",
                        help="Use start_method='fork' instead of 'loky'")
    parser.add_argument('--with-os', action="store_true",
                        help='Import os module in the child interpreter')
    args = parser.parse_args()

    # Only import os in the main module, this should fail if the interpreter is
    # not shared
    import os
    print("Main is {}".format(os.getpid()))

    if args.use_fork:
        ctx = mp
        print("Using fork context")
    else:
        ctx = get_context('loky_init_main')
        print("Using loky context")

    if args.with_os:
        target = child_with_os
    else:
        target = child_without_os

    p = ctx.Process(target=target)
    p.start()
    p.join()
This gives
# Use the default context, the child process has a copy-on-write interpreter
# state and can use the os module.
$ python2 test.py --use-fork
Main is 14630
Using fork context
Hello from 14633
# Use the loky context, the child process has a fresh interpreter
# state and needs to import the os module.
$ python2 test.py
Main is 14661
Using loky context
Process LokyInitMainProcess-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/tom/Work/prog/loky/test.py", line 6, in child_without_os
    print("Hello from {}".format(os.getpid()))
NameError: global name 'os' is not defined
# Now using the correct child function, which imports the os module
$ python2 test.py --with-os
Main is 14700
Using loky context
Hello from 14705
(DISCLAIMER: I am one of the maintainers of loky.)
As you're no doubt aware, the patches in the CPython bug tracker don't apply cleanly to Python 2.7's version of multiprocessing, and the patches include some extra features for semaphore.c so that semaphores are cleaned up properly afterwards.
I think your best bet would be to backport the multiprocessing module from Python 3. Copy the Python code over, rename it to just processing, discover the missing C features and work around them (e.g. clean up your own semaphores, or don't use them). Although the library is big, it may be straightforward to port only the features that you use. If you are able to publish the backport, I'm sure many people would be interested in that project.
Depending on how heavily you rely on multiprocessing, a different option would be to just run more Python interpreters by launching sys.executable with the subprocess module.
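As a rough illustration of that last option, here is a minimal sketch that starts a fresh interpreter through sys.executable and reads back what it prints. The helper name run_in_fresh_interpreter and the inline CHILD_CODE snippet are made up for this example; in a real codebase you would more likely launch a worker script and exchange data over pipes or files. It sticks to calls that exist in both Python 2.7 and Python 3.

```python
import os
import subprocess
import sys

# Hypothetical child program, passed inline with -c to keep the example short.
CHILD_CODE = "import os; print(os.getpid())"


def run_in_fresh_interpreter(code):
    """Run `code` in a brand-new Python process and return its stdout as text.

    The child is started with fork + exec of sys.executable, so it does not
    keep the parent's interpreter state (imported modules, globals, ...).
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
    )
    out, _ = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(
            "child exited with status {}".format(proc.returncode))
    return out.decode("ascii").strip()


if __name__ == "__main__":
    print("Main is {}".format(os.getpid()))
    print("Child was {}".format(run_in_fresh_interpreter(CHILD_CODE)))
```

The trade-off is that nothing is shared implicitly: arguments and results have to be serialized explicitly (command-line arguments, stdin/stdout, files), which is more work than multiprocessing but avoids patching the standard library.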