Python code in docker NOT using all available CPU cores (uses only one)

I am using AWS Batch to run a Python script with a few modules that run in parallel (in a Docker container on AWS ECR). When I manually invoke the script on a 16-core Linux machine, I see 16 Python processes executing the code in parallel.

Hoping to speed up the run further, I wanted to use AWS Batch to run the same script, autoscaling to 64 cores. However, this method only spins up one Python process, which is obviously slower than my initial approach.

Other details: the parallel Python method I am running is pairwise_distances (built on the joblib library). I built the Docker image on a Windows 10 machine, pushed it to ECR, and invoked the run using AWS Batch.

Am I missing something critical to invoke Python's parallel backend, or are there Docker configuration settings that I failed to set? Thanks a lot in advance for your help.
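(A purely diagnostic aside, not part of the original script: printing how many CPUs the interpreter actually sees inside the container can narrow this down. If these numbers already report a single core, the limit is coming from the container or the Batch environment rather than from joblib.)

import os
import joblib

# CPUs visible to the interpreter inside the container
print("os.cpu_count():", os.cpu_count())
# CPUs this process is actually allowed to run on (Linux scheduler affinity)
print("affinity:", len(os.sched_getaffinity(0)))
# Cores joblib will use when n_jobs=-1
print("joblib.cpu_count():", joblib.cpu_count())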

Sample Python code: script.py

import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import pairwise_distances

X = pd.DataFrame(np.random.randint(0,100,size=(1000, 4)), columns=list('ABCD'))
Y = pd.DataFrame(np.random.randint(0,100,size=(10000, 4)), columns=list('ABCD'))

output = pd.DataFrame(
    pairwise_distances(
        X.to_numpy(),
        Y.to_numpy(),
        metric=lambda u, v: round(
            (np.sum(np.minimum(u, v), axis=0) / np.sum(u, axis=0)) * 100, 2
        ),
        n_jobs=-1,
    ),
    columns=Y.index,
    index=X.index,
)

output.to_csv('outputData.csv', sep=',', na_rep='', index=False)

Dockerfile:

FROM python:3.7
ADD script.py /
COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt
CMD ["python", "./script.py"]

requirements.txt:

pandas
numpy
scikit-learn
joblib

Does it make a difference if you wrap the code-to-be-parallelized in a joblib.Parallel() context manager?

import joblib
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import pairwise_distances

X = pd.DataFrame(np.random.randint(0, 100, size=(1000, 4)), columns=list("ABCD"))
Y = pd.DataFrame(np.random.randint(0, 100, size=(10000, 4)), columns=list("ABCD"))

with joblib.Parallel(n_jobs=-1):
    distances = pairwise_distances(
        X.to_numpy(),
        Y.to_numpy(),
        metric=lambda u, v: round(
            (np.sum(np.minimum(u, v), axis=0) / np.sum(u, axis=0)) * 100, 2
        ),
        n_jobs=-1,
    )

output = pd.DataFrame(distances, columns=Y.index, index=X.index)
# ...
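If wrapping in Parallel() alone does not help, a related variant worth trying (only a sketch, assuming the default process-based loky backend is available in the image) is to force the backend and worker count with joblib.parallel_backend, which scikit-learn honours for its internal joblib calls. Leaving n_jobs unset on pairwise_distances inside the context lets the backend's worker count apply.

from joblib import parallel_backend

# Route scikit-learn's internal parallelism through the process-based
# "loky" backend, using every core the container exposes.
with parallel_backend("loky", n_jobs=-1):
    distances = pairwise_distances(
        X.to_numpy(),
        Y.to_numpy(),
        metric=lambda u, v: round(
            (np.sum(np.minimum(u, v), axis=0) / np.sum(u, axis=0)) * 100, 2
        ),
    )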
