
Multiple queues from one multiprocessing Manager

I'm writing a script that will use Python's multiprocessing and threading modules. For your understanding: I spawn as many processes as there are cores available, and inside each process I start e.g. 25 threads. Each thread consumes from an input_queue and produces to an output_queue. For the queue objects I use multiprocessing.Queue.

After my first tests I got a deadlock because the thread responsible for feeding and flushing the Queue was hanging. After a while I found that I can use Queue().cancel_join_thread() to work around this problem.
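For reference, this is roughly what the workaround looks like: cancel_join_thread() tells the process not to wait for the queue's internal feeder thread on exit, which avoids the hang at the cost of possibly losing buffered items (a minimal sketch, not taken from the script described above):

```python
import multiprocessing

q = multiprocessing.Queue()
# After this call, process exit will not block waiting for the queue's
# feeder thread to flush buffered data -- avoids the hang, but items
# still in the buffer at exit may be lost.
q.cancel_join_thread()

q.put("work item")
print(q.get())  # normal put/get still works as usual
```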

But because of the possibility of data loss with that workaround, I would like to use multiprocessing.Manager().Queue() instead.

Now to the actual question: Is it better to use one Manager object for each queue, or should I create one Manager and get both queues from the same Manager object?

# One manager for all queues
import multiprocessing

manager = multiprocessing.Manager()
input_queue = manager.Queue()
output_queue = manager.Queue()

...Magic...

# As much managers as queues
manager_in = multiprocessing.Manager()
queue_in = manager_in.Queue()

manager_out = multiprocessing.Manager()
queue_out = manager_out.Queue()

...Magic...

Thank you for your help.

There is no need to use two separate Manager objects. As you have already seen, a Manager object allows sharing objects among multiple processes; from the docs:

Managers provide a way to create data which can be shared between different processes. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.

Therefore, if you have two different queues, you can still use the same manager. In case it helps someone, here is a simple example using two queues with one manager:

from multiprocessing import Manager, Process
import time


class Worker(Process):
    """
    Simple worker.
    """

    def __init__(self, name, in_queue, out_queue):
        super(Worker, self).__init__()
        self.name = name
        self.in_queue = in_queue
        self.out_queue = out_queue

    def run(self):
        while True:
            # grab work; do something to it (+1); then put the result on the output queue
            work = self.in_queue.get()
            print("{} got {}".format(self.name, work))
            work += 1

            # sleep to allow the other workers a chance (b/c the work action is too simple)
            time.sleep(1)

            # put the transformed work on the queue
            print("{} puts {}".format(self.name, work))
            self.out_queue.put(work)


if __name__ == "__main__":
    # construct the queues
    manager = Manager()
    inq = manager.Queue()
    outq = manager.Queue()

    # construct the workers
    workers = [Worker(str(name), inq, outq) for name in range(3)]
    for worker in workers:
        worker.start()

    # add data to the queue for processing
    work_len = 10
    for x in range(work_len):
        inq.put(x)

    while outq.qsize() != work_len:
        # waiting for workers to finish
        print("Waiting for workers. Out queue size {}".format(outq.qsize()))
        time.sleep(1)

    # clean up
    for worker in workers:
        worker.terminate()

    # print the outputs
    while not outq.empty():
        print(outq.get())

Using two managers instead like so:

# construct the queues
manager1 = Manager()
inq = manager1.Queue()
manager2 = Manager()
outq = manager2.Queue()

works, but there is no need for the second server process it spawns.
