
Which operations in numpy/scipy are multithreaded?

I'm working on an algorithm, and I've made no attempt to parallelize it other than just by using numpy/scipy. Looking at htop, sometimes the code uses all of my cores and sometimes just one. I'm considering adding parallelism to the single-threaded portions using multiprocessing or something similar.

Assuming that I have all of the parallel BLAS/MKL libraries, is there some rule of thumb that I can follow to guess whether a numpy/scipy ufunc is going to be multithreaded or not? Even better, is there some place where this is documented?

To try to figure this out, I've looked at: https://scipy.github.io/old-wiki/pages/ParallelProgramming, Python: How do you stop numpy from multithreading?, and multithreaded blas in python/numpy.

You may try the IDP package (Intel® Distribution for Python), which contains versions of NumPy*, SciPy*, and scikit-learn* with the integrated Intel® Math Kernel Library.

This would give you threading of all LAPACK routines automatically, whether or not it makes sense to do so. You can find the list of MKL's threaded functions here: https://software.intel.com/en-us/mkl-linux-developer-guide-openmp-threaded-functions-and-problems
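
As a quick sanity check (not from the linked guide, just a suggested snippet): you can ask NumPy which BLAS/LAPACK it was built against; an MKL-backed install will list mkl libraries in the output.

import numpy as np

# Print the BLAS/LAPACK build configuration; look for "mkl" entries if you
# installed the Intel Distribution or a conda MKL build.
np.show_config()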

The routines intrinsic to numpy and scipy allow single threads by default. You can change that if you so choose.

# encoding: utf-8
# module numpy.core.multiarray
# from /path/to/anaconda/lib/python3.6/site-packages/numpy/core/multiarray.cpython-36m-darwin.so
# by generator 1.145
# no doc
# no imports

# Variables with simple values

ALLOW_THREADS = 1
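
A quick runtime check of the flag shown in the dump above (a minimal sketch, assuming the NumPy 1.x module layout):

import numpy.core.multiarray as multiarray

# ALLOW_THREADS == 1 means the C-level ufunc loops are permitted to release
# the GIL; it does not by itself make a single call use multiple cores.
print(multiarray.ALLOW_THREADS)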

When compiling numpy, you can control threading by changing NPY_ALLOW_THREADS:

./core/include/numpy/ufuncobject.h:#if NPY_ALLOW_THREADS
./core/include/numpy/ndarraytypes.h:        #define NPY_ALLOW_THREADS 1

As for the external libraries, I've mostly found numpy and scipy to wrap around legacy Fortran code (QUADPACK, LAPACK, FITPACK, and so on). All the subroutines in these libraries compute on single threads.
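
For example (a sketch under the assumptions above: a BLAS/MKL-backed NumPy, and a single-threaded QUADPACK routine standing in for the serial part of your algorithm), you can leave the BLAS call alone and spread the wrapped-Fortran calls over processes yourself:

import numpy as np
from scipy import integrate
from multiprocessing import Pool

def integrate_one(k):
    # scipy.integrate.quad wraps QUADPACK and runs on a single thread.
    value, _ = integrate.quad(lambda x: np.sin(k * x) ** 2, 0.0, np.pi)
    return value

if __name__ == "__main__":
    A = np.random.rand(2000, 2000)
    B = np.random.rand(2000, 2000)
    C = A.dot(B)  # typically multithreaded via BLAS/MKL

    with Pool() as pool:  # parallelize the single-threaded part across processes
        results = pool.map(integrate_one, range(1, 9))
    print(results[:3])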

As for the MKL dependencies, the SO posts you link to sufficiently answer the question.

Please try setting the environment variable OMP_NUM_THREADS. It works for my scipy and numpy. The functions I use are:

linalg.inv() and A.dot(B)
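
A minimal sketch of that approach (the thread count 4 is only an example; the variable must be set before NumPy is imported, and MKL-backed builds may also respect MKL_NUM_THREADS):

import os
os.environ["OMP_NUM_THREADS"] = "4"   # set before importing numpy
os.environ["MKL_NUM_THREADS"] = "4"   # assumption: an MKL-backed build

import numpy as np

A = np.random.rand(3000, 3000)
B = np.random.rand(3000, 3000)

C = A.dot(B)              # threaded matrix multiply, capped at 4 threads
A_inv = np.linalg.inv(A)  # threaded LAPACK-backed inverse, same cap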
