
multiprocessing.Pool processes locked to a single core

I'm using multiprocessing.Pool in Python on Ubuntu 12.04, and I'm running into a curious problem: when I call map_async on my Pool, I spawn 8 processes, but they all fight over a single core of my 8-core machine. The exact same code uses both cores on my MacBook Pro, and all four cores of my other Ubuntu 12.04 desktop (as measured with htop in all cases).

My code is too long to post in full, but the important part is:

P = multiprocessing.Pool()
results = P.map_async( unwrap_self_calc_timepoint, zip([self]*self.xLen,xrange(self.xLen)) ).get(99999999999)
P.close()
P.join()
ipdb.set_trace()

where unwrap_self_calc_timepoint is a wrapper function that passes the necessary self argument to the class method, based on the advice of this article.

All three computers are using Python 2.7.3, and I don't really know where to start in hunting down why that one Ubuntu computer is acting up. Any help as to how to begin narrowing the problem down would be helpful. Thank you!
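
One quick way to start narrowing this down (a minimal diagnostic sketch, assuming the taskset utility is available, as it normally is on Ubuntu) is to print the process's CPU affinity mask before and after importing numpy; if the mask collapses to a single core after the import, that would point to the numpy/affinity issue described in the answers below:

import os

# Show the affinity mask of the current process before importing numpy...
os.system("taskset -p %d" % os.getpid())

import numpy  # import placed here deliberately for the check

# ...and again afterwards; a mask of e.g. 1 instead of ff means the
# process has been pinned to a single core by the import.
os.system("taskset -p %d" % os.getpid())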

I had the same problem. In my case, the solution was to tell Linux to use all of the processors rather than just one: try adding the following 2 lines at the beginning of your code:

import os
os.system("taskset -p 0xfffff %d" % os.getpid())
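
On Python 3.3+ (not the asker's 2.7.3, so this is only an alternative sketch) the same reset can be done without shelling out to taskset, using os.sched_setaffinity:

import os

# Allow the current process (pid 0 = self) and any workers it later forks
# to run on every core the OS reports.
os.sched_setaffinity(0, range(os.cpu_count()))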

This seems to be a fairly common issue with numpy on certain Linux distributions. I haven't had any luck using taskset near the start of the program, but it does the trick when called inside the code being parallelized:

import multiprocessing as mp
import numpy as np
import os

def something(arg):
    # Reset this worker's CPU affinity so it is no longer pinned to the
    # single core inherited from the parent process.
    os.system("taskset -p 0xfffff %d" % os.getpid())
    X = np.random.randn(5000,2000)
    Y = np.random.randn(2000,5000)
    Z = np.dot(X,Y)
    return Z.mean()

pool = mp.Pool(processes=10)
out = pool.map(something, np.arange(20))
pool.close()
pool.join()
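
If calling taskset inside every task feels heavy-handed, one variation (a sketch, not from the original answer) is to pass an initializer to multiprocessing.Pool, so the affinity is reset once per worker process rather than once per task:

import multiprocessing as mp
import os

import numpy as np

def reset_affinity():
    # Runs once in each worker right after it is forked; undoes any
    # affinity mask inherited from the parent (e.g. set by a BLAS import).
    os.system("taskset -p 0xfffff %d" % os.getpid())

def something(_):
    X = np.random.randn(5000, 2000)
    Y = np.random.randn(2000, 5000)
    return np.dot(X, Y).mean()

pool = mp.Pool(processes=10, initializer=reset_affinity)
out = pool.map(something, range(20))
pool.close()
pool.join()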
