Python multiprocessing Queue + Process hanging

I'm trying to use a multiprocessing.Queue to manage a set of tasks that are sent by the main process and picked up by "worker" processes (multiprocessing.Process). The workers then run the tasks and put the results into a result queue.

Here is my main script:

import time
from multiprocessing import Process, Queue, freeze_support
import auxiliaries as aux
import functions

if __name__ == '__main__':
    freeze_support()
    
    start = time.perf_counter()
    # number of processes
    nprocs = 3

    # define the tasks
    tasks = [(functions.get_stats_from_uniform_dist, (2**23, i)) for i in range(600)]

    # start the queues
    task_queue = Queue()
    result_queue = Queue()

    # populate task queue
    for task in tasks:
        task_queue.put(task)

    # after all tasks are in the queue, send a message to stop picking...
    for _ in range(nprocs):
        task_queue.put('STOP')

    # start workers 
    procs = []
    for _ in range(nprocs):
        p = Process(target=aux.worker, args=(task_queue, result_queue))
        p.start()
        procs.append(p)

    for p in procs:
        p.join()

    # print what's in the result queue
    while not result_queue.empty():
        print(result_queue.get())

The imported modules are:

auxiliaries.py

from multiprocessing import current_process

def calculate(func, args):
    """
    Calculates a certain function for a list of arguments. Returns a string with the result.

    Arguments:
        - func (string): function name
        - args (list): list of arguments
    """
    result = func(*args)
    string = current_process().name
    string = string + " says " + func.__name__ + str(args)
    string = string + " = " + str(result)
    return string


def worker(inputQueue, outputQueue):
    """
    Picks up work from the inputQueue and outputs result to outputQueue.

    Inputs:
        - inputQueue (multiprocessing.Queue)
        - outputQueue (multiprocessing.Queue)
    """
    for func, args in iter(inputQueue.get, 'STOP'):
        result = calculate(func, args)
        outputQueue.put(result)

functions.py

import numpy as np

def get_stats_from_uniform_dist(nDraws, seed):
    """
    Calculates average and standard deviation of nDraws from NumPy's random.rand().

    Arguments:
        - nDraws (int): number of elements to draw
        - seed (int): random number generator's seed

    Returns:
        - results (list): [average, std]
    """
    np.random.seed(seed)
    x = np.random.rand(nDraws)
    return [x.mean(), x.std()]

This is entirely based on the example in https://docs.python.org/3/library/multiprocessing.html#multiprocessing-examples

Everything works fine with up to roughly 500 tasks. Beyond that, the code hangs: it looks like one of the processes never finishes, so the script gets stuck when I join them. The queues don't appear to be full. My suspicion is that one of the processes never finds a 'STOP' entry in task_queue and keeps trying to .get() forever, but I don't understand how or why that could happen. Any ideas on what might be causing the lock-up? Thanks!

You might have a better time using the higher-level Pool.imap_unordered() method, which handles all of this for you.

from multiprocessing import Pool

import functions

# functions.get_stats_from_uniform_dist needs to take a single argument,
# because imap_unordered passes each task as one object:
def get_stats_from_uniform_dist(task):
    nDraws, seed = task
    # ...

if __name__ == '__main__':
    nprocs = 3
    tasks = [(2**23, i) for i in range(600)]
    with Pool(nprocs) as p:
        results = list(p.imap_unordered(functions.get_stats_from_uniform_dist, tasks, chunksize=10))

The problem is here:

if __name__ == '__main__':
    .
    .
    .

    # start workers 
    procs = []
    for _ in range(nprocs):
        p = Process(target=aux.worker, args=(task_queue, result_queue))
        p.start()
        procs.append(p)

    for p in procs:
        p.join()

    # print what's in the result queue
    while not result_queue.empty():
        print(result_queue.get())

Or, more specifically, the fact that I join the processes before result_queue is drained. As @Charchit pointed out, this is actually mentioned in the docs:

Joining processes that use queues

Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the "feeder" thread to the underlying pipe. (The child process can call the Queue.cancel_join_thread method of the queue to avoid this behaviour.)

This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.
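For reference, one way to honour this while keeping plain multiprocessing.Queue objects is to drain result_queue before joining the workers: every task produces exactly one result, so the main process can .get() len(tasks) results first and only join afterwards. A minimal sketch of the end of the original main script (an alternative fix, not the solution I ended up using):

    # drain the result queue first, so each worker's feeder thread can flush
    results = [result_queue.get() for _ in range(len(tasks))]

    # only now join the worker processes
    for p in procs:
        p.join()

    for r in results:
        print(r)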

The solution I'm using, at the cost of spawning one extra process, is to use a Manager.Queue() instead, as suggested here. The main script then becomes:

from multiprocessing import Process, freeze_support, Manager
import auxiliaries as aux
import functions

if __name__ == '__main__':
    freeze_support()
    
    # number of processes
    nprocs = 3

    # define the tasks
    tasks = [(functions.get_stats_from_uniform_dist, (2**18, i)) for i in range(1000)]

    # use a manager to share the queues between processes
    manager = Manager()
    task_queue = manager.Queue()
    result_queue = manager.Queue()

    # populate task queue
    for task in tasks:
        task_queue.put(task)

    # after all tasks are in the queue, send a message to stop picking...
    for _ in range(nprocs):
        task_queue.put('STOP')

    # start processes (workers)
    procs = []
    for _ in range(nprocs):
        p = Process(target=aux.worker, args=(task_queue, result_queue))
        p.start()
        procs.append(p)

    # wait until workers are done
    for p in procs:
        p.join()

    # print what's in the result queue
    while not result_queue.empty():
        print(result_queue.get())
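auxiliaries.py and functions.py work unchanged, since the worker only calls .get() and .put() on whatever queue objects it is handed. The difference is that manager queues are proxy objects talking to a queue that lives in the manager's own process, so the feeder-thread caveat quoted above no longer applies to the workers. A quick way to see this (a small sketch, separate from the script above):

from multiprocessing import Manager

if __name__ == '__main__':
    with Manager() as manager:
        q = manager.Queue()
        # prints a proxy class, not multiprocessing.queues.Queue:
        # the underlying queue lives in the manager's server process
        print(type(q))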
