Numba performance issue with np.nan and np.inf
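I am playing with numba to speed up my code. I noticed a large performance difference when using np.inf instead of np.nan inside a function. Below I attach three example functions for illustration. function1 is not accelerated by numba. function2 and function3 are both accelerated by numba, but one uses np.nan while the other uses np.inf. On my machine, the average run times of the three functions are 0.032284s, 0.041548s and 0.019712s, respectively. Using np.nan appears to be much slower than using np.inf. Why is the performance difference so large? Thanks in advance.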
Edit: I am using Python 3.7.11 and Numba 0.55.0rc1.
import numpy as np
import numba as nb

def function1(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.nan
    output2[:] = np.nan

    for r in range(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold = np.nonzero((row1 - row2) > 8)
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0

    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.nan)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

@nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
def function2(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.nan
    output2[:] = np.nan

    for r in nb.prange(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold = np.nonzero((row1 - row2) > 8)
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0

    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.nan)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

@nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
def function3(array1, array2):
    nr, nc = array1.shape
    output1 = np.empty((nr, nc), dtype='float')
    output2 = np.empty((nr, nc), dtype='float')
    output1[:] = np.inf
    output2[:] = np.inf

    for r in nb.prange(nr):
        row1 = array1[r]
        row2 = array2[r]
        diff = row1 - row2
        id_threshold = np.nonzero((row1 - row2) > 8)
        output1[r][id_threshold] = 1
        output2[r][id_threshold] = 0

    output1 = output1.flatten()
    output2 = output2.flatten()
    id_keep = np.nonzero(output1 != np.inf)
    output1 = output1[id_keep]
    output2 = output2[id_keep]
    output = np.vstack((output1, output2))
    return output

array1 = 10*np.random.random((1000,1000))
array2 = 10*np.random.random((1000,1000))
output1 = function1(array1, array2)
output2 = function2(array1, array2)
output3 = function3(array1, array2)
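
The second one is much slower because output1 != np.nan is True for every element, so indexing with it returns a full copy of output1. This happens because np.nan != np.nan is True (as with any other value: v != np.nan is always true). As a result, the arrays that have to be computed are much bigger, which makes execution slower.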
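
To make the point above concrete, here is a minimal sketch on a tiny stand-in array (the five-element arrays are made up for illustration and are not the question's data). It shows that the NaN comparison keeps every element, while the same comparison against inf filters as intended.

```python
import numpy as np

# Small stand-in for output1: NaN marks "no result", 1.0 marks a hit.
output1 = np.array([np.nan, 1.0, np.nan, 1.0, np.nan])

# NaN compares unequal to everything, including itself.
print(np.nan != np.nan)                 # True

# So the "filter" mask is True everywhere and nothing is dropped.
mask_nan = output1 != np.nan
kept_nan = output1[np.nonzero(mask_nan)]
print(mask_nan.all(), kept_nan.size)    # True 5

# With inf as the sentinel, the same comparison removes the sentinels.
output_inf = np.array([np.inf, 1.0, np.inf, 1.0, np.inf])
kept_inf = output_inf[np.nonzero(output_inf != np.inf)]
print(kept_inf.size)                    # 2
```

On the 1000x1000 inputs from the question, this means the NaN version indexes and stacks all one million elements per array instead of only the ones above the threshold.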
The key point is that you must never use comparison operators to compare a value with np.nan: use np.isnan(value) instead. In your case, you should use np.logical_not(np.isnan(output1)).
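
A minimal sketch of the corrected filtering step in plain NumPy follows. The helper name keep_valid and the small input arrays are hypothetical and only serve the illustration; in the question's functions the same expression would replace the output1 != np.nan line.

```python
import numpy as np

def keep_valid(output1, output2):
    """Hypothetical helper: drop the positions where output1 is NaN."""
    # np.isnan flags the NaN sentinels; logical_not turns that into a keep-mask.
    id_keep = np.nonzero(np.logical_not(np.isnan(output1)))
    return np.vstack((output1[id_keep], output2[id_keep]))

# Made-up flattened outputs: NaN means the threshold was not exceeded.
output1 = np.array([np.nan, 1.0, np.nan, 1.0])
output2 = np.array([np.nan, 0.0, np.nan, 0.0])

result = keep_valid(output1, output2)
print(result.shape)  # (2, 2)
```

Writing the mask as ~np.isnan(output1) is equivalent for boolean arrays in NumPy.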
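The second implementation may still be slightly slower because of the temporary array created by np.logical_not (after correcting the code, I did not see any statistically significant difference on my machine between using NaN and Inf).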