sklearn linear_model .fit() run in multiprocessing pool is slower than in single-process for loop
I've found that LinearRegression.fit() runs slower with Python multiprocessing than with a simple for-loop. Here is my code:
import time
import multiprocessing as mp
from contextlib import contextmanager

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

def generate_random_data():
    n_samples = int(1e4)
    n_features = 100
    X = np.random.normal(size=(n_samples, n_features))
    beta = np.random.uniform(size=n_features)
    y = X @ beta + np.random.normal(size=n_samples)
    return X, y

@contextmanager
def timeit(tag, container=None):
    t0 = time.perf_counter()
    yield
    t1 = time.perf_counter()
    if container is None:
        print('[{}] : {:.2f} seconds'.format(tag, t1 - t0))
    else:
        container[tag] = t1 - t0

def test(i):
    X, y = generate_random_data()
    m = LinearRegression()  # Ridge, Lasso produce similar results
    with timeit('OLS.fit {}'.format(i)):
        m.fit(X, y)

print('===================== MultiProcessing =====================')
with mp.Pool(10) as pool:
    pool.map(test, range(10))

print('===================== For Loop =====================')
for i in range(10):
    test(i)
output:
===================== MultiProcessing =====================
[OLS.fit 5] : 0.76 seconds
[OLS.fit 9] : 0.98 seconds
[OLS.fit 7] : 1.54 seconds
[OLS.fit 0] : 1.58 seconds
[OLS.fit 4] : 1.67 seconds
[OLS.fit 1] : 1.75 seconds
[OLS.fit 8] : 1.80 seconds
[OLS.fit 2] : 1.96 seconds
[OLS.fit 6] : 2.02 seconds
[OLS.fit 3] : 2.09 seconds
===================== For Loop =====================
[OLS.fit 0] : 0.02 seconds
[OLS.fit 1] : 0.02 seconds
[OLS.fit 2] : 0.02 seconds
[OLS.fit 3] : 0.02 seconds
[OLS.fit 4] : 0.02 seconds
[OLS.fit 5] : 0.02 seconds
[OLS.fit 6] : 0.02 seconds
[OLS.fit 7] : 0.02 seconds
[OLS.fit 8] : 0.02 seconds
[OLS.fit 9] : 0.02 seconds
If I change the model from LinearRegression to Lasso or Ridge, the result is almost the same. But if I change to other models, e.g. DecisionTreeRegressor, the multiprocessing version takes almost the same time as the simple for-loop.
My system: Ubuntu 18.04, Python 3.7 (Anaconda)
Solved by https://github.com/scikit-learn/scikit-learn/issues/17139
It may be because LinearRegression calls BLAS for its linear algebra computations, and BLAS by default uses all CPU threads. Spawning n processes from multiprocessing.Pool then leads to CPU oversubscription: n processes × all-cores BLAS thread pools compete for the same cores.
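One way to test this hypothesis is to cap the BLAS/OpenMP thread pools to a single thread per worker by setting the relevant environment variables before NumPy is imported. This is a minimal sketch using `np.linalg.lstsq` as a stand-in for `LinearRegression.fit` (the function name `solve_one` and the problem sizes are illustrative, not from the original post):

```python
import os
# Cap BLAS/OpenMP thread pools BEFORE importing numpy, so each worker
# process uses a single thread and the pool's processes do not oversubscribe.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

import multiprocessing as mp
import numpy as np

def solve_one(i):
    # Stand-in for LinearRegression().fit(X, y): a least-squares solve,
    # which also goes through BLAS/LAPACK.
    rng = np.random.default_rng(i)
    X = rng.normal(size=(1000, 50))
    y = X @ rng.uniform(size=50)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta.shape[0]

if __name__ == '__main__':
    with mp.Pool(4) as pool:
        out = pool.map(solve_one, range(4))
    print(out)
```

With the thread caps in place, each worker does single-threaded linear algebra, so the processes parallelize cleanly instead of fighting over cores.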
Should use joblib instead:

Parallel(10, prefer='processes')(delayed(test)(i) for i in range(10))
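For completeness, a self-contained sketch of the joblib approach (joblib ships with scikit-learn; its process-based loky backend limits each worker's thread pools by default, which is what avoids the oversubscription). The function `fit_ols` and its sizes are illustrative stand-ins for the original `test`:

```python
import numpy as np
from joblib import Parallel, delayed

def fit_ols(i):
    # Stand-in for the original test(): generate data and solve least squares.
    rng = np.random.default_rng(i)
    X = rng.normal(size=(1000, 50))
    y = X @ rng.uniform(size=50)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta.shape[0]

# prefer='processes' asks for the process-based backend; joblib caps each
# worker's BLAS/OpenMP threads so the workers do not oversubscribe the CPU.
results = Parallel(n_jobs=4, prefer='processes')(delayed(fit_ols)(i) for i in range(4))
print(results)
```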