Python multiprocessing Queue + Process hanging

I'm trying to use a multiprocessing.Queue to manage tasks that are sent by the main process and picked up by "worker" processes (multiprocessing.Process). The workers then run the tasks and put the results into a result queue. This is my main script:
from multiprocessing import Process, Queue, freeze_support
import time

import auxiliaries as aux
import functions

if __name__ == '__main__':
    freeze_support()
    start = time.perf_counter()

    # number of processes
    nprocs = 3

    # define the tasks
    tasks = [(functions.get_stats_from_uniform_dist, (2**23, i)) for i in range(600)]

    # start the queues
    task_queue = Queue()
    result_queue = Queue()

    # populate the task queue
    for task in tasks:
        task_queue.put(task)

    # after all tasks are in the queue, send one stop sentinel per worker
    for _ in range(nprocs):
        task_queue.put('STOP')

    # start workers
    procs = []
    for _ in range(nprocs):
        p = Process(target=aux.worker, args=(task_queue, result_queue))
        p.start()
        procs.append(p)

    for p in procs:
        p.join()

    # print what's in the result queue
    while not result_queue.empty():
        print(result_queue.get())
The imported modules are

auxiliaries.py
from multiprocessing import current_process

def calculate(func, args):
    """
    Calculates a certain function for a list of arguments. Returns a string with the result.

    Arguments:
        - func (callable): function to run
        - args (list): list of arguments
    """
    result = func(*args)
    string = current_process().name
    string = string + " says " + func.__name__ + str(args)
    string = string + " = " + str(result)
    return string

def worker(inputQueue, outputQueue):
    """
    Picks up work from the inputQueue and outputs the result to outputQueue.

    Inputs:
        - inputQueue (multiprocessing.Queue)
        - outputQueue (multiprocessing.Queue)
    """
    for func, args in iter(inputQueue.get, 'STOP'):
        result = calculate(func, args)
        outputQueue.put(result)
and functions.py
import numpy as np

def get_stats_from_uniform_dist(nDraws, seed):
    """
    Calculates average and standard deviation of nDraws from NumPy's random.rand().

    Arguments:
        - nDraws (int): number of elements to draw
        - seed (int): random number generator's seed

    Returns:
        - results (list): [average, std]
    """
    np.random.seed(seed)
    x = np.random.rand(nDraws)
    return [x.mean(), x.std()]
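For context, passing a distinct seed with each task is what keeps the draws reproducible and independent of which worker runs them: the same (nDraws, seed) pair always produces the same statistics. A quick standalone check (no multiprocessing involved):

```python
import numpy as np

def get_stats_from_uniform_dist(nDraws, seed):
    # seed the legacy global RNG, then draw nDraws uniform samples
    np.random.seed(seed)
    x = np.random.rand(nDraws)
    return [x.mean(), x.std()]

a = get_stats_from_uniform_dist(1000, 42)
b = get_stats_from_uniform_dist(1000, 42)
c = get_stats_from_uniform_dist(1000, 43)
print(a == b)  # same seed -> identical [mean, std]
print(a == c)  # different seed -> (almost surely) different result
```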
This is entirely based on https://docs.python.org/3/library/multiprocessing.html#multiprocessing-examples

Everything works fine up to about 500 tasks. Beyond that, the code hangs: it looks like one of the processes never finishes, so the script gets stuck when joining them. The queues don't appear to be full. I suspect one of the processes never finds a 'STOP' entry in task_queue, so it keeps trying to .get() forever, but I don't understand how or why that could happen. Any ideas on what might be causing the lockup? Thanks!
You might have a better time using the higher-level Pool.imap_unordered() method, which does all of this for you.
from multiprocessing import Pool

def get_stats_from_uniform_dist(task):
    nDraws, seed = task
    # ...

if __name__ == '__main__':
    nprocs = 3
    tasks = [(2**23, i) for i in range(600)]
    with Pool(nprocs) as p:
        results = list(p.imap_unordered(get_stats_from_uniform_dist, tasks, chunksize=10))
The problem is here:
if __name__ == '__main__':
    .
    .
    .
    # start workers
    procs = []
    for _ in range(nprocs):
        p = Process(target=aux.worker, args=(task_queue, result_queue))
        p.start()
        procs.append(p)

    for p in procs:
        p.join()

    # print what's in the result queue
    while not result_queue.empty():
        print(result_queue.get())
Or, more specifically, the fact that I join the processes before result_queue is drained. As @Charchit pointed out, this is actually mentioned in the docs:

Joining processes that use queues

Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the "feeder" thread to the underlying pipe. (The child process can call the Queue.cancel_join_thread method of the queue to avoid this behaviour.)

This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.
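One way to satisfy that requirement while keeping a plain multiprocessing.Queue is to drain result_queue before joining: since the number of tasks is known, the main process can get() exactly one result per task, which empties the workers' feeder buffers so join() can't block. A minimal sketch with a toy doubling worker (hypothetical stand-in for the real one):

```python
from multiprocessing import Process, Queue

def worker(inputQueue, outputQueue):
    # same sentinel pattern as above: keep working until 'STOP' arrives
    for item in iter(inputQueue.get, 'STOP'):
        outputQueue.put(item * 2)

if __name__ == '__main__':
    nprocs = 3
    tasks = list(range(20))

    task_queue = Queue()
    result_queue = Queue()

    for t in tasks:
        task_queue.put(t)
    for _ in range(nprocs):
        task_queue.put('STOP')

    procs = [Process(target=worker, args=(task_queue, result_queue))
             for _ in range(nprocs)]
    for p in procs:
        p.start()

    # drain the result queue FIRST: exactly len(tasks) results are coming,
    # so blocking get() is safe and flushes the workers' feeder threads
    results = [result_queue.get() for _ in tasks]

    # now join() cannot hang on buffered items
    for p in procs:
        p.join()

    print(sorted(results))
```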
The solution I'm using is, at the cost of spawning one more process, to use a Manager.Queue() instead, as suggested here. The main script then becomes
from multiprocessing import Process, freeze_support, Manager
import auxiliaries as aux
import functions

if __name__ == '__main__':
    freeze_support()

    # number of processes
    nprocs = 3

    # define the tasks
    tasks = [(functions.get_stats_from_uniform_dist, (2**18, i)) for i in range(1000)]

    # use a manager context to share queues between processes
    manager = Manager()
    task_queue = manager.Queue()
    result_queue = manager.Queue()

    # populate the task queue
    for task in tasks:
        task_queue.put(task)

    # after all tasks are in the queue, send one stop sentinel per worker
    for _ in range(nprocs):
        task_queue.put('STOP')

    # start processes (workers)
    procs = []
    for _ in range(nprocs):
        p = Process(target=aux.worker, args=(task_queue, result_queue))
        p.start()
        procs.append(p)

    # wait until workers are done
    for p in procs:
        p.join()

    # print what's in the result queue
    while not result_queue.empty():
        print(result_queue.get())
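One remaining caveat: the docs note that Queue.empty() is not reliable, so a `while not result_queue.empty()` loop can in principle exit before every result has been printed. Since the task count is known, getting exactly that many results is more robust. A sketch of the same Manager-based pattern with a toy worker (hypothetical, not the real functions above):

```python
from multiprocessing import Manager, Process

def worker(inputQueue, outputQueue):
    # stop on the 'STOP' sentinel, as in the answer above
    for item in iter(inputQueue.get, 'STOP'):
        outputQueue.put(item + 1)

if __name__ == '__main__':
    nprocs = 3
    tasks = list(range(10))

    manager = Manager()
    task_queue = manager.Queue()
    result_queue = manager.Queue()

    for t in tasks:
        task_queue.put(t)
    for _ in range(nprocs):
        task_queue.put('STOP')

    procs = [Process(target=worker, args=(task_queue, result_queue))
             for _ in range(nprocs)]
    for p in procs:
        p.start()
    # joining first is safe with a Manager.Queue: the data lives in the
    # manager process, not in per-worker feeder buffers
    for p in procs:
        p.join()

    # get() exactly len(tasks) results instead of trusting empty()
    results = [result_queue.get() for _ in tasks]
    print(sorted(results))
```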