
Do I need to pass multiprocessing queues to the process?

The following multiprocessing example works on my Ubuntu machine. It starts a process, sends a parameter via a queue, and receives the result of the computation via another queue:

import multiprocessing
import queue  # Provides the queue.Empty exception (Python 3).


# Module-level shared state, inherited by the worker (via fork on Unix).
running = multiprocessing.Value('i', 1)
request = multiprocessing.Queue(1)
response = multiprocessing.Queue(1)

def worker():
  while running.value:
    try:
      param = request.get(timeout=0.1)
    except queue.Empty:
      # Wake up periodically so the running flag can be checked.
      continue
    # Imagine heavy computation here.
    result = param ** 2
    response.put_nowait(result)


def main():
  process = multiprocessing.Process(target=worker)
  process.start()
  request.put_nowait(42)
  result = response.get()
  print('Result', result)
  running.value = 0
  process.join()


if __name__ == '__main__':
  main()

However, several examples on the web seem to pass every object the worker needs via multiprocessing.Process(target=worker, args=(running, request, response)). Is this necessary for some reason, for example platform compatibility?

People tend to follow the multiprocessing programming guidelines:

Better to inherit than pickle/unpickle

On Windows many types from multiprocessing need to be picklable so that child processes can use them. However, one should generally avoid sending shared objects to other processes using pipes or queues. Instead you should arrange the program so that a process which needs access to a shared resource created elsewhere can inherit it from an ancestor process.

If your implementation is simple enough, you can make use of global variables. Yet in more complex cases, you probably want to avoid them and prefer better encapsulation, passing the shared objects explicitly, as the sketch below shows.
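A minimal sketch of the same program with the shared objects handed to the worker through args (the names mirror your code; nothing new is assumed). When Queue and Value objects are passed as Process arguments, multiprocessing transfers them to the child at process-creation time, so this pattern also works with the spawn start method:

import multiprocessing
import queue


def worker(running, request, response):
  while running.value:
    try:
      param = request.get(timeout=0.1)
    except queue.Empty:
      # Wake up periodically so the running flag can be checked.
      continue
    # Imagine heavy computation here.
    result = param ** 2
    response.put_nowait(result)


def main():
  # Create the shared objects in the parent and pass them explicitly,
  # instead of relying on inherited module-level globals.
  running = multiprocessing.Value('i', 1)
  request = multiprocessing.Queue(1)
  response = multiprocessing.Queue(1)
  process = multiprocessing.Process(
      target=worker, args=(running, request, response))
  process.start()
  request.put_nowait(42)
  print('Result', response.get())
  running.value = 0
  process.join()


if __name__ == '__main__':
  main()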

Moreover, your implementation is likely not to work on Windows due to the way the OS handles the creation of new processes.

Unix uses fork, which duplicates the parent process's resources. Therefore, the child inherits the parent's open files (in your case, the Queue).

Windows uses the spawn method, which instead creates a "blank" process, loads a fresh Python interpreter, and re-constructs only the minimum needed to run the target function. It is very likely that the new process will end up with a brand-new Queue, different from the parent's one; hence, the data you send will never reach the child process.

A note on that last statement: the Python multiprocessing library tries to provide an OS-agnostic experience (which I personally dislike). This means your code might still happen to work on Windows because of this effort.

As the actual differences between fork and spawn are not well documented, it's always recommended to follow the programming guidelines to avoid weird behaviour. You can even reproduce the Windows behaviour on your Ubuntu machine by forcing the spawn start method, as in the sketch below.
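This is a minimal sketch, assuming Python 3.4 or later where multiprocessing.set_start_method is available. Under spawn, the child re-imports the main module and builds its own, unrelated queue, so the parent's data never arrives:

import multiprocessing

# Re-executed in the child under 'spawn', producing a different queue.
request = multiprocessing.Queue(1)


def worker():
  # The child's `request` is not the parent's: this get never sees the
  # 42 put below and raises queue.Empty after two seconds.
  print('Received', request.get(timeout=2))


if __name__ == '__main__':
  multiprocessing.set_start_method('spawn')
  process = multiprocessing.Process(target=worker)
  process.start()
  request.put_nowait(42)  # Lands only in the parent's copy of the queue.
  process.join()

The child dies with queue.Empty after the timeout, which is exactly the "data will never reach the child process" behaviour described above; the version that passes the queues via args works under both start methods.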
