Anaconda MKL can't set number of threads
I was using numpy from Anaconda to benchmark a large matrix multiplication (8192x8192, of type float32) like this (in Jupyter):
import numpy as np
a = np.empty((8192, 8192), 'f')
%timeit a @ a
The numpy is built against MKL. When doing the multiplication (continuously), I find that the CPU usage of python is always 50%. I'm wondering why it isn't 100%, since matrix multiplication should be parallelized automatically. I therefore googled around and found two ways to set the number of threads MKL uses.
One way is to use the DLL directly:
from ctypes import CDLL
mkl = CDLL('../conda/pkgs/mkl-2019.0-118/Library/bin/mkl_rt.dll')
print(mkl.MKL_Set_Num_Threads(4))
print(mkl.MKL_Get_Max_Threads())
which I believe gave me some unknown error code and failed to set the thread count:
-899695632
2
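A note on the two numbers printed above, offered as a sketch and not as a confirmed diagnosis: in MKL's C header, `MKL_Set_Num_Threads` is declared as returning `void`, while ctypes assumes an `int` return type unless told otherwise. The first number is therefore probably whatever happened to be in the return register, not an error code. The helper below (the name `set_mkl_threads` is hypothetical, and the DLL path is whatever applies on your machine) declares the signatures explicitly so that only the meaningful value is returned.

```python
from ctypes import CDLL, c_int


def set_mkl_threads(dll_path, n):
    """Ask MKL for n threads and return what MKL_Get_Max_Threads reports.

    Returns None if the library at dll_path cannot be loaded.
    `set_mkl_threads` is an illustrative helper, not part of MKL.
    """
    try:
        mkl = CDLL(dll_path)
    except OSError:
        return None

    # void MKL_Set_Num_Threads(int nth): there is no return value to print.
    mkl.MKL_Set_Num_Threads.argtypes = [c_int]
    mkl.MKL_Set_Num_Threads.restype = None

    # int MKL_Get_Max_Threads(void)
    mkl.MKL_Get_Max_Threads.argtypes = []
    mkl.MKL_Get_Max_Threads.restype = c_int

    mkl.MKL_Set_Num_Threads(n)
    return mkl.MKL_Get_Max_Threads()
```

With the signatures declared, the only number worth inspecting is the one returned by `MKL_Get_Max_Threads`. If it still reports 2 after a request for 4, the request was accepted but MKL chose a smaller count.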
Another way is through the mkl-service package:
import mkl
print(mkl.set_num_threads(4))
print(mkl.get_max_threads())
which also did not succeed:
None
2
I'm wondering why setting 4 threads in MKL keeps failing, and how to make it work. I'm on 64-bit Win7. My CPU is an i5-2520M, which should have 4 cores. My Anaconda environment looks like this (abbreviated):
mkl           2019.0    118
mkl-service   1.1.2     py36hb217b18_5
mkl_fft       1.0.6     py36hdbbee80_0
mkl_random    1.0.1     py36h77b88f5_1
numpy         1.15.3    py36ha559c80_0
numpy-base    1.15.3    py36h8128ebf_0
zeromq        4.2.5     he025d50_1
Please consider this documentation: https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-intel-mkl-100-threading
The key variable is MKL_NUM_THREADS, which you can set as a global Windows environment variable.
I strongly disagree with @roro on this. The reason you are seeing 50% is that you are not using your hyperthreading capabilities. Having said that, bear in mind that there are two limiting factors on calculation speed: CPU power and memory access bandwidth. Oftentimes the second limits the speed to, say, 70% of your CPU power, because RAM/cache cannot deliver data to the algorithm fast enough.
Getting parallelism right is among the more challenging parts of HPC.
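The 50% figure can be checked with simple arithmetic. The sketch below assumes that the i5-2520M exposes 4 logical CPUs (2 physical cores with hyperthreading) and that the task manager reports usage as a share of all logical CPUs. The function name `expected_cpu_percent` is made up for illustration.

```python
import os


def expected_cpu_percent(busy_threads, logical_cpus):
    """Overall CPU usage shown when busy_threads threads are fully busy
    on a machine with logical_cpus logical CPUs."""
    return 100.0 * min(busy_threads, logical_cpus) / logical_cpus


# Numbers from the question: MKL reports 2 threads; 4 logical CPUs is an
# assumption about the i5-2520M (2 cores with hyperthreading).
print(expected_cpu_percent(2, 4))  # 50.0
print("logical CPUs on this machine:", os.cpu_count())
```

Under these assumptions, two fully busy threads show up as 50% overall usage even though both physical cores are saturated, so reaching 100% on the meter would not necessarily mean the multiplication runs twice as fast.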