
Numpy vectorize sum over indices

I have a list of indices (list(int)) and a list of summing indices (list(list(int))). Given a 2D numpy array, for each column I need to sum the entries at the rows given by the matching entry of the second list, and add that sum to the entry at the row given by the first list. Is there any way to vectorize this? Here is the plain loop version:

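import numpy as np

indices = [1, 0, 2]
summing_indices = [[5, 6, 7], [6, 7, 8], [4, 5]]
matrix = np.arange(9 * 3).reshape((9, 3))
# For column c, add the sum over rows summing_indices[i] to the entry at row i.
for c, i in enumerate(indices):
    matrix[i, c] = matrix[summing_indices[i], c].sum() + matrix[i, c]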

Here's an almost* vectorized approach using np.add.reduceat:

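indices = np.asarray(indices)  # needed for .argsort() and fancy indexing
lens = np.array(list(map(len, summing_indices)))  # list() is required on Python 3
# Group j is summed in the column c where indices[c] == j, i.e. c = argsort(indices)[j].
# This assumes indices is a permutation of range(len(summing_indices)).
col = np.repeat(indices.argsort(), lens)
row = np.concatenate(summing_indices)
vals = matrix[row, col]
# Segment sums: reduceat takes the start offset of each (non-empty) group.
addvals = np.add.reduceat(vals, np.append(0, lens.cumsum()[:-1]))
# Column c receives the sum of group indices[c], matching the loop.
matrix[indices, np.arange(len(indices))] += addvals[indices]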
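
The key step above is the segment sum done by np.add.reduceat. As a minimal sketch with made-up values (the array contents here are illustrative and not taken from the question), this shows how the group lengths turn into start offsets and how reduceat sums each slice between consecutive offsets:

```python
import numpy as np

# Flattened values of three groups with lengths 3, 3 and 2 (illustrative data).
vals = np.array([1, 2, 3, 4, 5, 6, 7, 8])
lens = np.array([3, 3, 2])

# Start offset of each group: [0, 3, 6].
offsets = np.append(0, lens.cumsum()[:-1])

# reduceat sums vals[0:3], vals[3:6] and vals[6:].
group_sums = np.add.reduceat(vals, offsets)
print(group_sums)  # [ 6 15 15]
```

One assumption worth stating: every group must be non-empty. An empty group makes two consecutive offsets coincide, and reduceat does not return 0 for that group.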

Please note that this has some setup overhead, so it is best suited to 2D input arrays with a large number of columns, since the loop it replaces iterates along the columns.

*: "Almost" because of the use of map() at the start, but computationally that should be negligible.
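
The reduceat snippet ties each summing group to a column through indices, which only makes sense when indices is a permutation. As a hedged alternative sketch, the group can instead be picked per column, which also covers repeated entries in indices. The helper names loop_update and reduceat_update are hypothetical, both work on copies, and every group is assumed to be non-empty. The plain loop from the question serves as the reference:

```python
import numpy as np

def loop_update(matrix, indices, summing_indices):
    # Reference: the plain loop from the question, applied to a copy.
    out = matrix.copy()
    for c, i in enumerate(indices):
        out[i, c] += matrix[summing_indices[i], c].sum()
    return out

def reduceat_update(matrix, indices, summing_indices):
    # Hypothetical helper; assumes every summing group is non-empty.
    indices = np.asarray(indices)
    cols = np.arange(len(indices))
    # Group used by each column, in column order.
    groups = [summing_indices[i] for i in indices]
    lens = np.array([len(g) for g in groups])
    row = np.concatenate(groups)
    col = np.repeat(cols, lens)
    # Segment sums over the flattened (row, col) selections.
    sums = np.add.reduceat(matrix[row, col], np.append(0, lens.cumsum()[:-1]))
    out = matrix.copy()
    out[indices, cols] += sums
    return out

summing_indices = [[5, 6, 7], [6, 7, 8], [4, 5]]
matrix = np.arange(9 * 3).reshape((9, 3))
# The question's input, a non-self-inverse permutation, and repeated indices.
for indices in ([1, 0, 2], [1, 2, 0], [2, 2, 0]):
    assert np.array_equal(loop_update(matrix, indices, summing_indices),
                          reduceat_update(matrix, indices, summing_indices))
```

The list comprehension is still a Python-level loop over the columns, so this is "almost" vectorized in the same sense as the map() call.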
