multiprocessing.Pool processes locked to a single core

I'm using multiprocessing.Pool in Python on Ubuntu 12.04, and I'm running into a curious problem: when I call map_async on my Pool, I spawn 8 processes, but they all struggle for dominance over a single core of my 8-core machine. The exact same code uses both cores of my MacBook Pro and all four cores of my other Ubuntu 12.04 desktop (as measured with htop in all cases).

My code is too long to post in full, but the important part is:

P = multiprocessing.Pool()
results = P.map_async( unwrap_self_calc_timepoint, zip([self]*self.xLen,xrange(self.xLen)) ).get(99999999999)
P.close()
P.join()
ipdb.set_trace()

where unwrap_self_calc_timepoint is a wrapper function to pass the necessary self argument to a class, based on the advice of this article.

All three computers are using Python 2.7.3, and I don't really know where to start in hunting down why that one Ubuntu computer is acting up. Any help in narrowing the problem down would be appreciated. Thank you!

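I had the same problem. In my case, the solution was to tell Linux to let the process run on all of the processors instead of just one. Try adding the following 2 lines at the beginning of your code: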

import os
os.system("taskset -p 0xfffff %d" % os.getpid())
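To check whether a process really is pinned before changing anything, the affinity mask can also be read and reset from Python itself. The following is only a sketch, not what the answer above uses: it assumes Python 3.3+ on Linux, where os.sched_getaffinity and os.sched_setaffinity exist (they are not available in Python 2.7.3), and the helper names affinity_report and widen_affinity are made up for illustration.

```python
import os


def affinity_report():
    """Return (allowed, total): CPUs this process may run on, and CPUs present.

    os.sched_getaffinity exists only on some platforms (notably Linux,
    Python 3.3+); where it is missing, `allowed` is None.
    """
    total = os.cpu_count() or 1
    if not hasattr(os, "sched_getaffinity"):
        return None, total
    return os.sched_getaffinity(0), total  # pid 0 means "the calling process"


def widen_affinity():
    """Ask the kernel to allow every CPU for this process; return the new set."""
    if not hasattr(os, "sched_setaffinity"):
        return None
    try:
        os.sched_setaffinity(0, range(os.cpu_count() or 1))
    except OSError:
        pass  # leave the current mask untouched if the kernel refuses
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    allowed, total = affinity_report()
    print("allowed CPUs before:", allowed, "of", total)
    print("allowed CPUs after: ", widen_affinity())
```

If the "before" set contains a single CPU on a multi-core machine, the process has been pinned, which would match the htop symptom described in the question.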

This seems to be a fairly common issue between numpy and certain Linux distributions. I haven't had any luck using taskset near the start of the program, but it does do the trick when used in the code to be parallelized:

import multiprocessing as mp
import numpy as np
import os

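def something(_):
    # pool.map passes one item to each call; it is unused here.
    # Reset this worker's CPU affinity so it may run on any core
    # (the mask 0xfffff covers CPUs 0-19).
    os.system("taskset -p 0xfffff %d" % os.getpid())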
    X = np.random.randn(5000,2000)
    Y = np.random.randn(2000,5000)
    Z = np.dot(X,Y)
    return Z.mean()

pool = mp.Pool(processes=10)
out = pool.map(something, np.arange(20))
pool.close()
pool.join()
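The example above runs taskset through a shell on every task. A possible variation is to reset the affinity once per worker through the initializer argument of multiprocessing.Pool. This is a sketch under assumptions rather than a tested fix for the numpy case: it assumes Python 3.3+ on Linux for os.sched_setaffinity, and the names widen_affinity and allowed_cpu_count are hypothetical.

```python
import multiprocessing as mp
import os


def widen_affinity():
    """Pool initializer: runs once in each worker process as it starts."""
    if hasattr(os, "sched_setaffinity"):  # Linux, Python 3.3+
        try:
            os.sched_setaffinity(0, range(os.cpu_count() or 1))
        except OSError:
            pass  # keep whatever affinity the worker inherited


def allowed_cpu_count(_):
    """Hypothetical task: report how many CPUs this worker may run on."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


if __name__ == "__main__":
    pool = mp.Pool(processes=4, initializer=widen_affinity)
    try:
        # Each entry is the size of the affinity set seen by the worker
        # that handled that task.
        print(pool.map(allowed_cpu_count, range(8)))
    finally:
        pool.close()
        pool.join()
```

Because the initializer runs inside each worker after it has been created, it still takes effect when the parent process was pinned before the pool was built, for example by an earlier import.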
