
Optimizing numpy array multiplication: * faster than numpy.dot?

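Questions:

1) How can numpy.dot() be slower than * in the example code below, given that BLAS is in use?

2) Is there a way to use numpy.dot() instead of * in this case to get faster array multiplication? I think I am missing a crucial piece of information that would answer question 1 and would mean numpy.dot() is at least as fast as *, if not faster.

Details follow. Thanks in advance for any answers and help.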

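Details:

I am writing a program to solve coupled PDEs using python 2.7 (64-bit), numpy 1.11.2 and Anaconda2 on Windows 7. To improve the accuracy of the program's output, I need to use large arrays (shape (2, 2^14) and larger) and small integration steps, which results in a very large number of array multiplications per simulation. I need to optimise these for speed.

Having looked around, it seems that numpy.dot() should be used instead of * for faster array multiplication, as long as BLAS is installed and working with numpy. This is frequently recommended. However, when I use the timer script below, * is faster than numpy.dot() by at least a factor of 7. In some cases this grows to factors of more than 1000: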

from __future__ import division
import numpy as np
import timeit

def dotter(a, b):
    return np.dot(a, b)

def timeser(a, b):
    return a*b

def wrapper(func, a, b):
    def wrapped():
        return func(a, b)
    return wrapped

size = 100
num = int(3e5)

a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))

wrapped = wrapper(dotter, a, b)
dotTime = timeit.timeit(wrapped, number=num)/num
print "\nTime for np.dot: ", dotTime

wrapped = wrapper(timeser, a, b)
starTime = timeit.timeit(wrapped, number=num)/num
print "\nTime for *: ", starTime

print "dotTime / starTime: ", dotTime/starTime

Output:

Time for np.dot:  8.58201189949e-05
Time for *:  1.07564737429e-05
dotTime / starTime:  7.97846218436

Both numpy.dot() and * are distributed across multiple cores, which I think suggests that BLAS is working, at least to some extent.

Looking at numpy.__config__.show(), it appears that I am using BLAS and lapack (although not openblas_lapack?):

lapack_opt_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']
    library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']
blas_opt_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']
    library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']
openblas_lapack_info:
  NOT AVAILABLE
lapack_mkl_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']
    library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']
blas_mkl_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_blas95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'libiomp5md', 'libifportmd']
    library_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:/Program Files (x86)/Intel/Composer XE/mkl/include']

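np.dot performs matrix-matrix multiplication, whereas * is element-wise multiplication. In Python 3.5+, the operator for matrix-matrix multiplication is @.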
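To make the distinction concrete, here is a minimal sketch. The small input values and all variable names are illustrative assumptions, not taken from the question, and the code assumes Python 3.5+ so that @ is available. It compares the two operations on a 2 x 2 example, gives a rough multiply count for the 100 x 100 benchmark, and shows what happens with two arrays of the shape (2, 2^14) mentioned in the question.

```python
import numpy as np

# Small, hand-checkable inputs (illustrative values, not from the question).
a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[5.0, 6.0],
              [7.0, 8.0]])

elementwise = a * b          # c[i, j] = a[i, j] * b[i, j]
matrix_prod = np.dot(a, b)   # c[i, j] = sum over k of a[i, k] * b[k, j]
matrix_prod_op = a @ b       # same result as np.dot for 2-D arrays

print(elementwise)
print(matrix_prod)

# Rough multiply counts for square n x n inputs, as in the benchmark.
n = 100
mults_elementwise = n * n    # one multiply per output element
mults_matrix = n * n * n     # n multiplies per output element
print(mults_matrix // mults_elementwise)

# Two arrays of shape (2, 2**14): element-wise works, np.dot does not,
# because the inner dimensions (2**14 and 2) do not match.
x = np.ones((2, 2**14))
y = np.ones((2, 2**14))
print((x * y).shape)

dot_failed = False
try:
    np.dot(x, y)
except ValueError as exc:
    dot_failed = True
    print("np.dot failed:", exc)
```

The two timings in the question therefore measure different computations: for 100 x 100 inputs the matrix product needs about 100 times as many multiplications as the element-wise product. If the simulation needs element-wise products, * is already the appropriate operation, and np.dot is not a drop-in replacement for it.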

