How to create a child process using multiprocessing in Python 2.7.10 without the child sharing resources with the parent?

We are trying to move our Python 2.7.10 codebase from Windows to Linux. We recently discovered that the multiprocessing library in Python 2.7 behaves differently on Windows and Linux. We have found many articles like this one describing the problem; however, we are unable to find a solution online for Python 2.7. There is a fix for this issue in Python 3.4, but we are unable to upgrade to Python 3.4. Is there any way to use multiprocessing in Python 2.7 on Linux without the child and parent sharing memory? We would also welcome guidance on modifying the forking.py code in Python 2.7 so that the child and parent processes do not share memory through copy-on-write. Thanks!

A possible solution is to use loky, a library which provides an implementation of Process with fork-exec in Python 2.7. The fork-exec start method behaves similarly to spawn, with a fresh interpreter in the newly spawned process. The library is mainly designed to provide a concurrent.futures API, but you can use mp = loky.backend.get_context() to get the same API as multiprocessing.

from loky.backend import get_context
import multiprocessing as mp


def child_without_os():
    print("Hello from {}".format(os.getpid()))


def child_with_os():
    import os
    print("Hello from {}".format(os.getpid()))


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser('Test loky backend')
    parser.add_argument('--use-fork', action="store_true",
                        help="Use start_method='fork' instead of 'loky'")
    parser.add_argument('--with-os', action="store_true",
                        help='Import os module in the child interpreter')
    args = parser.parse_args()

    # Only import os in the main block: a child that does not inherit the
    # parent's interpreter state will fail when it tries to use os
    import os
    print("Main is {}".format(os.getpid()))
    if args.use_fork:
        ctx = mp
        print("Using fork context")
    else:
        ctx = get_context('loky_init_main')
        print("Using loky context")

    if args.with_os:
        target = child_with_os
    else:
        target = child_without_os

    p = ctx.Process(target=target)
    p.start()
    p.join()

This gives

# Use the default context, the child process has a copy-on-write interpreter
# state and can use the os module.
$ python2 test.py --use-fork
Main is 14630
Using fork context
Hello from 14633


# Use the loky context, the child process has a fresh interpreter
# state and needs to import the os module.
$ python2 test.py
Main is 14661
Using loky context
Process LokyInitMainProcess-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/tom/Work/prog/loky/test.py", line 6, in child_without_os
    print("Hello from {}".format(os.getpid()))
NameError: global name 'os' is not defined


# Now using the correct child function, which imports the os module
$ python2 test.py --with-os
Main is 14700
Using loky context
Hello from 14705

(DISCLAIMER: I am one of the maintainers of loky.)

As you're no doubt aware, the patches in the CPython bug tracker don't apply cleanly to Python 2.7's version of multiprocessing, and the patches include some extra features for semaphore.c so that semaphores are cleaned up properly afterwards.

I think your best bet would be to backport the multiprocessing module from Python 3. Copy the Python code over, rename it to just processing, discover the missing C features and work around them (e.g. clean up your own semaphores or don't use them). Although the library is big, it may be straightforward to port only the features that you use. If you are able to publish the backport, I'm sure many people would be interested in that project.

Depending on how heavily you rely on multiprocessing, a different option would be to just run more Pythons by running sys.executable with the subprocess module.
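As a rough illustration of that last option, here is a minimal sketch that starts a fresh interpreter with sys.executable and exchanges data with it as JSON over stdin/stdout. The names CHILD_CODE and run_in_fresh_interpreter, and the JSON protocol itself, are made up for this example; a real codebase would more likely point the child at its own script or module. The child inherits none of the parent's Python objects or imported modules, because it is a brand-new process.

```python
import json
import subprocess
import sys

# Hypothetical child program, handed to the new interpreter with -c.
# It imports everything it needs itself, since nothing is inherited.
CHILD_CODE = """
import json, os, sys
payload = json.loads(sys.stdin.read())
result = {"pid": os.getpid(), "total": sum(payload["numbers"])}
sys.stdout.write(json.dumps(result))
"""


def run_in_fresh_interpreter(numbers):
    """Sum `numbers` in a separate Python process and return its reply."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() writes the request, closes stdin and waits for exit.
    request = json.dumps({"numbers": numbers}).encode("utf-8")
    out, _ = proc.communicate(request)
    if proc.returncode != 0:
        raise RuntimeError("child exited with status %d" % proc.returncode)
    return json.loads(out.decode("utf-8"))


if __name__ == "__main__":
    import os
    reply = run_in_fresh_interpreter([1, 2, 3])
    print("Parent is {}, child was {}, total = {}".format(
        os.getpid(), reply["pid"], reply["total"]))
```

The trade-off compared with multiprocessing is that arguments and results have to be serialized by hand, and there are no ready-made queues, pools or locks.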
