Improve performance of combined for blocks with numpy
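I have a fairly simple function, func1(), defined below, which consists mainly of two nested for blocks. It works fine for small values of N, but because the for blocks are nested, the number of iterations grows as N squared, and the running time quickly reaches several minutes for N > 1000.

How can I use numpy broadcasting to improve the performance of this function?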
import numpy as np
import time as t

def func1(A_triang, B_triang):
    aa = []
    for i, A_tr in enumerate(A_triang):
        for j, B_tr in enumerate(B_triang):
            # Absolute value of differences.
            abs_diff = abs(np.array(A_tr) - np.array(B_tr))
            # Store the sum of the differences and the indexes
            aa.append([sum(abs_diff), i, j])
    return aa

# Generate random data with the proper format
N = 500
A_triang = np.random.uniform(0., 20., (N, 3))
A_triang[:, 0] = np.ones(N)
B_triang = np.random.uniform(0., 20., (N, 3))
B_triang[:, 0] = np.ones(N)

# Call function and time it (time.clock() was removed in Python 3.8).
s = t.perf_counter()
aa = func1(A_triang, B_triang)
print(t.perf_counter() - s)
Here's one approach with NumPy broadcasting, using a modified version of indices_merged_arr_generic_using_cp to attach the indices:
import functools
import numpy as np

# Based on https://stackoverflow.com/a/46135435/ @unutbu
def indices_merged_arr_generic_using_cp(arr):
    """
    Based on cartesian_product
    http://stackoverflow.com/a/11146645/190597 (senderle)
    """
    shape = arr.shape
    arrays = [np.arange(s, dtype='int') for s in shape]
    broadcastable = np.ix_(*arrays)
    broadcasted = np.broadcast_arrays(*broadcastable)
    rows, cols = functools.reduce(np.multiply, broadcasted[0].shape),\
                 len(broadcasted)+1
    out = np.empty(rows * cols, dtype=arr.dtype)
    # First block holds the values; the following blocks hold one index each.
    start, end = rows, 2*rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    out[0:rows] = arr.flatten()
    return out.reshape(cols, rows).T

def func1_numpy_broadcasting(a,b):
    # val[i, j] = sum of absolute differences between a[i] and b[j]
    val = np.abs(a[:,None,:] - b).sum(-1)
    return indices_merged_arr_generic_using_cp(val)
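
To make the broadcasting step concrete, here is a minimal sketch on tiny, made-up inputs. It is only an illustration: the arrays are hypothetical, and the index columns are attached with np.indices and np.column_stack rather than with the helper above, which should give the same [value, i, j] layout.

```python
import numpy as np

# Tiny hypothetical inputs: 2 rows in a, 3 rows in b, 3 values per row.
a = np.array([[1., 2., 3.],
              [1., 5., 7.]])
b = np.array([[1., 2., 4.],
              [1., 0., 3.],
              [1., 6., 6.]])

# a[:, None, :] has shape (2, 1, 3); subtracting b (3, 3) broadcasts to (2, 3, 3).
diff = np.abs(a[:, None, :] - b)
val = diff.sum(-1)  # shape (2, 3): val[i, j] = sum(|a[i] - b[j]|)

# Attach the (i, j) indices as extra columns (swapped-in alternative to the helper).
i_idx, j_idx = np.indices(val.shape)
result = np.column_stack((val.ravel(), i_idx.ravel(), j_idx.ravel()))

# Reference: the double loop from the question.
expected = [[np.abs(a_row - b_row).sum(), i, j]
            for i, a_row in enumerate(a)
            for j, b_row in enumerate(b)]
assert np.allclose(result, expected)
```

Each row of result is [sum of absolute differences, i, j], in the same order the nested loops would produce it.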
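If the first column of both inputs is always 1s, we don't need to include it, because its differences are all zero and add nothing to the sum. So, as an alternative way to get val, we can simply use the last two columns: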
val = np.abs(a[:,1,None] - b[:,1]) + np.abs(a[:,2,None] - b[:,2])
This saves memory, because we never create the intermediate 3D array.
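
As a rough check of that claim, the sketch below compares the two ways of computing val on random data in the question's format. The size N = 200 and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200  # arbitrary illustrative size

a = rng.uniform(0., 20., (N, 3))
a[:, 0] = 1.
b = rng.uniform(0., 20., (N, 3))
b[:, 0] = 1.

# Full version: builds an (N, N, 3) temporary before summing over the last axis.
diff3d = np.abs(a[:, None, :] - b)
val_full = diff3d.sum(-1)

# Two-column version: every temporary is only (N, N).
val_2col = np.abs(a[:, 1, None] - b[:, 1]) + np.abs(a[:, 2, None] - b[:, 2])

# Same result, because the first columns are identical and cancel out.
assert np.allclose(val_full, val_2col)

# The 3D temporary is three times the size of the final (N, N) array.
print(diff3d.nbytes, val_2col.nbytes)
```

The two-column form still allocates a few (N, N) temporaries, so peak memory is reduced rather than eliminated.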
Using the numexpr module:
import numexpr as ne

def func1_numexpr_broadcasting(a,b):
    a3D = a[:,None,:]
    val = ne.evaluate('sum(abs(a3D - b),2)')
    return indices_merged_arr_generic_using_cp(val)
Again making use of the fact that the first column is all 1s, we can do:
def func1_numexpr_broadcasting_v2(a,b):
    a1 = a[:,1,None]
    b1 = b[:,1]
    a2 = a[:,2,None]
    b2 = b[:,2]
    val = ne.evaluate('abs(a1-b1) + abs(a2-b2)')
    return indices_merged_arr_generic_using_cp(val)
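
To compare the approaches on your own machine, a timing harness along these lines may help. It is a sketch under stated assumptions: the names loop_version and broadcast_version are hypothetical stand-ins, numexpr is left out so that it runs with NumPy alone, and N and the repeat count are arbitrary. No timings are quoted here because they depend on hardware and library versions.

```python
import timeit
import numpy as np

def loop_version(a, b):
    # Baseline: the same double loop as in the question.
    return [[np.abs(a_row - b_row).sum(), i, j]
            for i, a_row in enumerate(a)
            for j, b_row in enumerate(b)]

def broadcast_version(a, b):
    # Two-column broadcasting; assumes column 0 is identical in a and b.
    val = np.abs(a[:, 1, None] - b[:, 1]) + np.abs(a[:, 2, None] - b[:, 2])
    i_idx, j_idx = np.indices(val.shape)
    return np.column_stack((val.ravel(), i_idx.ravel(), j_idx.ravel()))

rng = np.random.default_rng(0)
N = 100  # kept small so the loop baseline finishes quickly
a = rng.uniform(0., 20., (N, 3))
a[:, 0] = 1.
b = rng.uniform(0., 20., (N, 3))
b[:, 0] = 1.

# Check that both versions agree before timing them.
assert np.allclose(loop_version(a, b), broadcast_version(a, b))

repeats = 3
t_loop = timeit.timeit(lambda: loop_version(a, b), number=repeats) / repeats
t_bcast = timeit.timeit(lambda: broadcast_version(a, b), number=repeats) / repeats
print(f"loop: {t_loop:.4f} s, broadcasting: {t_bcast:.6f} s")
```

The numexpr variants above can be timed the same way by wrapping them in a lambda, provided numexpr is installed.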