
Multi-Core Python: multiprocessing Vs. zeroMQ?

I'd like to write python code that loads a data set as input and analyzes it.

Five parallel processes will analyze the data, and each process will analyze it in a different way.

Before any processing begins, the master script (the one that starts all the sub-processes) will define an empty list.

I'd like all the different processes to write their output to the same list mentioned above (meaning, each one of the processes will be able to directly manipulate the list that was defined in the master script).

Meaning, if process1 changed the first value of that list, all the other processes (while running) will see that the first value of that list has changed.


I get the sense that two different Python modules can be used to solve this problem: multiprocessing and zeroMQ.

Are there any reasons to prefer one over the other in this case? Does your answer change if, instead of running everything on the same server, I split the processes across multiple servers?

(If it matters at all, I am using Linux.)

You can't compare apples and oranges.

multiprocessing is a library for spawning and managing multiple processes.

zmq is a library that allows processes to use messages to communicate.

They do different jobs.

If these are your only two choices and you know for sure that you're going to be distributing your load across multiple machines, ZeroMQ is the only one of the two choices that fits the bill.

The Python multiprocessing module is for distributing load across processes/cores on a single machine. As far as I know, there is no networked protocol underlying the multiprocessing module, and this is indicated by the first paragraph in the accompanying documentation.

ZeroMQ can be used for similar inter-process messaging on a single machine with its IPC protocol, but it also has network-based protocols that allow you to send messages between processes running on different machines as well.

That said, this question has the slight tinge of an XY problem, since you seem to have arbitrarily narrowed your choices to only two of the many, many possibilities for implementing a distributed program using Python.
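For the single-machine case, the shared list from the question can be approximated with a manager-backed list. The following is only a minimal sketch: `analyze` and `run_workers` are hypothetical names, the squared worker id stands in for a real analysis, and the `fork` start method is assumed because the question mentions Linux.

```python
import multiprocessing as mp


def analyze(worker_id, shared):
    # Hypothetical stand-in for one of the five analyses:
    # each worker overwrites its own slot in the shared list.
    shared[worker_id] = worker_id * worker_id


def run_workers(n_workers=5):
    # Assumption: Linux, so the "fork" start method is available.
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        # The list lives in the manager's server process; workers hold proxies.
        shared = manager.list([None] * n_workers)
        procs = [
            ctx.Process(target=analyze, args=(i, shared))
            for i in range(n_workers)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy the contents out before the manager shuts down.
        return list(shared)


if __name__ == "__main__":
    print(run_workers())
```

Every read or write through the proxy is a round trip to the manager process, so a change made by one worker is visible to the others, at the cost of some latency per access.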
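As a rough illustration of the messaging approach, here is a PUSH/PULL sketch over the IPC transport using the third-party pyzmq package. Note that it does not give the workers a shared list: the master owns the list and fills it from the messages it receives. The names `worker` and `collect`, the message layout, and the socket file name are all assumptions made for this example.

```python
import multiprocessing as mp
import os
import tempfile

import zmq  # third-party package: pyzmq


def worker(worker_id, endpoint):
    # Each process creates its own ZeroMQ context after it has started.
    ctx = zmq.Context()
    sock = ctx.socket(zmq.PUSH)
    sock.connect(endpoint)
    # Hypothetical result: the squared worker id stands in for a real analysis.
    sock.send_json({"worker": worker_id, "result": worker_id * worker_id})
    sock.close()
    ctx.term()  # with the default linger, waits until the message is delivered


def collect(n_workers=5):
    # Hypothetical IPC endpoint in a fresh temporary directory.
    endpoint = "ipc://" + os.path.join(tempfile.mkdtemp(), "results.ipc")
    mp_ctx = mp.get_context("fork")  # assumption: Linux
    procs = [
        mp_ctx.Process(target=worker, args=(i, endpoint))
        for i in range(n_workers)
    ]
    for p in procs:
        p.start()

    # The master creates its context only after forking the workers.
    ctx = zmq.Context()
    sink = ctx.socket(zmq.PULL)
    sink.bind(endpoint)
    results = [None] * n_workers
    for _ in range(n_workers):
        msg = sink.recv_json()
        results[msg["worker"]] = msg["result"]
    for p in procs:
        p.join()
    sink.close()
    ctx.term()
    return results


if __name__ == "__main__":
    print(collect())
```

To move the workers onto other machines, the `ipc://` endpoint would be replaced with a `tcp://host:port` endpoint; the rest of the socket code stays the same.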

Edit: My answer here was incorrect, and since I can't delete an accepted answer, I am converting it to a wiki in case anyone wants to correct it. The short story is that I misread the documentation in haste. Python multiprocessing does support inter-process communication over a network boundary. One major difference is that ZeroMQ is designed to be platform agnostic, so you could mix client/server agents on different platforms, whereas Python multiprocessing is a batteries-included option if the client/server processes are coupled to Python.
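The networked side of multiprocessing is exposed through `multiprocessing.managers.BaseManager`, which can listen on a TCP address. Below is a minimal sketch, not a production setup: `ResultManager`, `get_results`, `start_server`, `connect`, and the `b"change-me"` authkey are hypothetical, and the `fork` start method is assumed (Linux). Here both ends run on localhost, but a client on another machine would pass the server's real address.

```python
import multiprocessing as mp
from multiprocessing.managers import BaseManager, ListProxy

_results = []  # the real list; it lives in the manager's server process


def _get_results():
    return _results


class ResultManager(BaseManager):
    """Hypothetical manager that exposes a single shared list."""


# ListProxy gives clients list-style access (indexing, len, append, ...).
ResultManager.register("get_results", callable=_get_results, proxytype=ListProxy)


def start_server(host="127.0.0.1", port=0, authkey=b"change-me"):
    # Port 0 lets the OS pick a free port; read it back from manager.address.
    manager = ResultManager(
        address=(host, port), authkey=authkey, ctx=mp.get_context("fork")
    )
    manager.start()
    return manager


def connect(address, authkey=b"change-me"):
    # A client only needs the server's address and the shared authkey.
    client = ResultManager(address=address, authkey=authkey)
    client.connect()
    return client.get_results()


if __name__ == "__main__":
    server = start_server()
    writer = connect(server.address)
    writer.append("from client A")
    reader = connect(server.address)
    print(list(reader))
    server.shutdown()
```

Both clients talk to the same list in the server process, so a value appended through one proxy is visible through the other.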
