
Python multiprocessing Queue failure

I create 100 child processes:

from multiprocessing import Process, Queue

result_queue = Queue()
proc_list = [
    Process(target=simulator, args=(result_queue,))
    for i in xrange(100)]

and start them:

for proc in proc_list: proc.start()

Each process, after doing some processing, puts 10000 tuples into result_queue (an instance of multiprocessing.Queue):

def simulate(alg_instance, image_ids, gamma, results,
             simulations, sim_semaphore):
  (rs, qs, t_us) = alg_instance.simulate_multiple(image_ids, gamma,
                                                  simulations)
  all_tuples = zip(rs, qs, t_us)
  for result in all_tuples:
    results.put(result)   # one put per tuple, 10000 in total
  sim_semaphore.release()

I should (?) be getting 1000000 tuples in the queue, but after various runs I get these (sample) sizes: 14912, 19563, 12952, 13524, 7487, 18350, 15986, 11928, 14281, 14282, 7317.

Any suggestions?

My solution to multiprocessing issues is almost always to use the Manager objects. While the exposed interface is the same, the underlying implementation is much simpler and has fewer bugs.

from multiprocessing import Manager
manager = Manager()
result_queue = manager.Queue()

Try it out and see if it doesn't fix your issues.
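
For reference, here is a minimal self-contained sketch of the whole flow (Python 3; worker() and the tuple contents are illustrative stand-ins, not the OP's simulator). Because the items live in the manager's server process rather than in a pipe buffer, it is safe to join the producers first and count afterwards:

from multiprocessing import Manager, Process

def worker(result_queue):
    # Stand-in for the real simulation: put 10000 tuples.
    for i in range(10000):
        result_queue.put((i, i, i))

if __name__ == '__main__':
    manager = Manager()
    result_queue = manager.Queue()   # proxy to a queue in the manager process
    procs = [Process(target=worker, args=(result_queue,))
             for _ in range(100)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(result_queue.qsize())      # expect 1000000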

multiprocessing.Queue is said to be thread-safe in its documentation. But when you are doing inter-process communication with a queue, it should be used via multiprocessing.Manager().Queue().

There's no evidence from the OP's post that multiprocessing.Queue does not work. The code posted by the OP is not at all sufficient to understand what's going on: do they join all the processes? Do they correctly pass the queue to the child processes (it has to be passed as a parameter on Windows)? Do their child processes verify that they actually produced 10000 tuples each? Etc.

There's a chance that the OP is really encountering a hard-to-reproduce bug in mp.Queue, but given the amount of testing CPython has gone through, and the fact that I just ran 100 processes x 10000 results without any trouble, I suspect the OP actually had some problem in their own code.
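
A sketch of such a test (Python 3; producer() is an illustrative stand-in, not the OP's simulator). The one subtlety with a plain multiprocessing.Queue is that the parent must drain the queue before joining the producers: a child that has put items on the queue does not exit until its buffered data is flushed to the underlying pipe, so joining first can deadlock, and sampling the queue size while children are still running undercounts — which would produce exactly the kind of partial totals the OP reports:

from multiprocessing import Process, Queue

N_PROCS, N_ITEMS = 100, 10000

def producer(q):
    # Illustrative worker: put N_ITEMS tuples and exit.
    for i in range(N_ITEMS):
        q.put((i, i, i))

if __name__ == '__main__':
    q = Queue()
    procs = [Process(target=producer, args=(q,))
             for _ in range(N_PROCS)]
    for p in procs:
        p.start()
    count = 0
    for _ in range(N_PROCS * N_ITEMS):
        q.get()              # drain before joining the producers
        count += 1
    for p in procs:
        p.join()
    print(count)             # 1000000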

Yes, the Manager().Queue() mentioned in other answers is a perfectly fine way to share data, but there's no reason to avoid multiprocessing.Queue() based on unconfirmed reports that "something is wrong with it".
