
multiprocessing.Pool seems to work in Windows but not in ubuntu?

SOLVED: The problem was Wingware Python IDE. I guess the natural question now is how this is possible and how it could be fixed.

I asked a question yesterday (Problem with multiprocessing.Pool in Python) and this question is almost the same, but I have figured out that the code seems to work on a Windows computer and not on my ubuntu machine. At the end of this post I will post a slightly different version of the code that does the same thing.

Short summary of my problem: When using multiprocessing.Pool in Python I am not always able to get the number of workers that I am asking for. When this happens, the program just stalls.

I have been working on a solution all day, and then I came to think about Noah's comment on my previous question. He said that it worked on his machine, so I gave the code to my colleague, who runs a Windows machine with Enthought's 64-bit Python 2.7.1 distribution. I have the same distribution, with the big difference that mine runs on ubuntu. I should also mention that we both have Wingware Python IDE, but I doubt that this is of any importance?

There are two problems with my code that don't arise when my colleague runs the code on his machine.

  1. I am not always able to get the four workers I am asking for (although my machine has 12 hardware threads available). When this happens, the process just stalls and does not continue. No exception or error is raised.

  2. When I am able to get the four workers I ask for (which happens approximately 1 out of 5 times), the figures that are produced (plain random numbers) are EXACTLY the same for all four pictures. This is not the case for my colleague.

Something is very fishy and I am very thankful for any kind of help you guys can offer.

The code:

import multiprocessing as mp
import scipy as sp
import scipy.stats as spstat
import pylab

def testfunc(x0, N):
    print 'working with x0 = %s' % x0
    x = [x0]
    for i in xrange(1,N):
        x.append(spstat.norm.rvs(size = 1)) # stupid appending to make it slower
        if i % 10000 == 0:
            print 'x0 = %s, i = %s' % (x0, i)
    return sp.array(x)

def testfuncParallel(fargs):
    return testfunc(*fargs)


# Define Number of tasks.
nTasks = 4
N = 100000

if __name__ == '__main__':

    """
    Try number 1. Using multiprocessing.Pool together with Pool.map_async
    """
    pool = mp.Pool(processes = nTasks) # I have 12 threads (six cores) available, so I am surprised that it does not get access to nTasks = 4 workers

    # Define tasks:
    tasks = [(x, n) for x, n in enumerate(nTasks*[N])] # nTasks different tasks

    # Compute in parallel: async, i.e. the call returns immediately and the tasks are not necessarily run in order.
    result = pool.map_async(testfuncParallel, tasks)

    pool.close() # These are needed if map_async is used
    pool.join()

    # Get results:
    sim = sp.zeros((N, nTasks)) 

    for nn, res in enumerate(result.get()):    
        sim[:, nn] = res

    pylab.figure()
    for i in xrange(nTasks):
        pylab.subplot(nTasks,1, i + 1)
        pylab.plot(sim[:, i])

    pylab.show()

Thanks in advance.

Sincerely, Matias

I don't have a solution for your first problem. In fact, I can run your code repeatedly without fail on my 64-bit Ubuntu box with Enthought's Python 2.7.1 [EPD 7.0-2 (64-bit)]. Edit: It turns out the problem was being caused by your IDE (Wingware). The obvious workaround is to run the script from outside the IDE.

As to the second question, what happens is that on Unix every worker process inherits the same state of the random number generator from the parent process, because the workers are created with fork(), which copies the parent's memory. This is why they generate identical pseudo-random sequences. On Windows each worker starts as a fresh interpreter, which is why your colleague does not see this. All you have to do to fix this is call scipy.random.seed at the top of testfunc:

def testfunc(x0, N):
    sp.random.seed()
    print 'working with x0 = %s' % x0
    ...
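For anyone who wants to try the idea without scipy, here is a minimal sketch of the same fix using only the standard library, assuming Python 3 (the code in the question is Python 2). The names draw_samples, run_parallel and base_seed are made up for illustration and are not part of the original code. Instead of re-seeding a shared global generator, each task builds its own generator, so nothing depends on state inherited from the parent:

```python
import multiprocessing as mp
import random


def draw_samples(args):
    """Return n Gaussian samples from a generator private to this task."""
    task_id, n, base_seed = args
    # A fresh generator per task: no state is shared with, or inherited
    # from, the parent process. Offsetting a base seed by the task id
    # gives streams that differ between tasks but repeat between runs.
    # Using random.Random() with no argument would instead seed from OS
    # entropy, which is closer to calling seed() without arguments.
    rng = random.Random(base_seed + task_id)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def run_parallel(n_tasks=4, n=5, base_seed=12345):
    """Run draw_samples once per task in a pool of n_tasks workers."""
    tasks = [(task_id, n, base_seed) for task_id in range(n_tasks)]
    with mp.Pool(processes=n_tasks) as pool:
        return pool.map(draw_samples, tasks)
```

Call run_parallel() from inside an `if __name__ == '__main__':` block, as in the question's script; each returned list should then be different from the others.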

Update: It turns out this had nothing to do with matplotlib or the backends, but rather with a bug associated with multiprocessing in general. We've fixed this for Wing version 4.0.4+. The workaround is to avoid setting breakpoints in code that is executed in the sub-processes.

It seems to be Wing IDE's matplotlib support for the Tkinter backend interacting badly with multiprocessing. When I try this example it crashes in Tcl/Tk code. I suspect the person working on Windows was using a different matplotlib backend.

Turning off the "matplotlib event loop support" in Project Properties under the Extensions tab seems to work around it.

Or, adding the following seems to fix it for me when the "matplotlib event loop support" is turned on.

import matplotlib
matplotlib.use('WXAgg')

This will only work if you have the WXAgg backend. Other backends supported by Wing IDE (in such a way that plots remain interactive even if the debug process is paused) are GTKAgg and Qt4Agg, but I haven't tried those yet.

I'll see if I can find and fix the bug. I suspect we need to disable our event loop support when the process ID changes. Thanks for reporting this.
