Efficiently indexing a 2D numpy array using two 1D arrays

Question

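I have a large 2D numpy array and two 1D arrays that hold x/y indices into the 2D array. I want to use these 1D arrays to perform an operation on the 2D array. I can do this with a for loop, but that is very slow on a large array. Is there a faster way? I tried simply using the 1D arrays as indices, but that did not work. See the following example: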

import numpy as np

# Two example 2d arrays
cnt_a   =   np.zeros((4,4))
cnt_b   =   np.zeros((4,4))

# 1d arrays holding x and y indices
xpos    =   [0,0,1,2,1,2,1,0,0,0,0,1,1,1,2,2,3]
ypos    =   [3,2,1,1,3,0,1,0,0,1,2,1,2,3,3,2,0]

# This method works, but is very slow for a large array
for i in range(0,len(xpos)):
    cnt_a[xpos[i],ypos[i]] = cnt_a[xpos[i],ypos[i]] + 1

# This method is fast, but gives incorrect answer
cnt_b[xpos,ypos] = cnt_b[xpos,ypos]+1


# Print the results
print('Good:')
print(cnt_a)
print('')
print('Bad:')
print(cnt_b)

The output is:

Good:
[[ 2.  1.  2.  1.]
 [ 0.  3.  1.  2.]
 [ 1.  1.  1.  1.]
 [ 1.  0.  0.  0.]]

Bad:
[[ 1.  1.  1.  1.]
 [ 0.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  0.  0.  0.]]

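For the cnt_b array, numpy is obviously not summing correctly, but I'm not sure how to fix this without falling back on the (very inefficient) for loop used to compute cnt_a.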

Answer 1

Another approach using 1D indexing (suggested by @Shai), extended to answer the actual question:

>>> out = np.zeros((4, 4))
>>> idx = np.ravel_multi_index((xpos, ypos), out.shape) # extract 1D indexes
>>> x = np.bincount(idx, minlength=out.size)
>>> out.flat += x

np.bincount counts how many times each flat index built from xpos, ypos occurs and stores the counts in x.

Or, as suggested by @Divakar:

>>> out.flat += np.bincount(idx, minlength=out.size)
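For readers who want to run this outside an interactive session, here is a minimal self-contained sketch of the same idea using the data from the question. The variable names `shape` and `counts` are introduced here for illustration only; reshaping the counts instead of adding them into `out.flat` assumes the output starts from all zeros.

```python
import numpy as np

# Index data from the question
xpos = [0, 0, 1, 2, 1, 2, 1, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
ypos = [3, 2, 1, 1, 3, 0, 1, 0, 0, 1, 2, 1, 2, 3, 3, 2, 0]

shape = (4, 4)  # illustrative name for the target array shape

# Row-major flat index of every (x, y) pair
idx = np.ravel_multi_index((xpos, ypos), shape)

# Count how often each flat index occurs; minlength guarantees one
# slot per cell even if the trailing cells are never hit
counts = np.bincount(idx, minlength=shape[0] * shape[1])

# Fold the 1D counts back into the 2D layout
out = counts.reshape(shape)
print(out)
```

If the target array already holds values, adding the counts with `out.flat += counts`, as in the answer, keeps them.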

Answer 2

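We can compute the linear indices and then use np.add.at to accumulate into a zero-initialized output array. So, with xpos and ypos as arrays, here is one implementation -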

m,n = xpos.max()+1, ypos.max()+1
out = np.zeros((m,n),dtype=int)
np.add.at(out.ravel(), xpos*n+ypos, 1)

Sample run -

In [95]: # 1d arrays holding x and y indices
    ...: xpos    =   np.array([0,0,1,2,1,2,1,0,0,0,0,1,1,1,2,2,3])
    ...: ypos    =   np.array([3,2,1,1,3,0,1,0,0,1,2,1,2,3,3,2,0])
    ...: 

In [96]: cnt_a   =   np.zeros((4,4))

In [97]: # This method works, but is very slow for a large array
    ...: for i in range(0,len(xpos)):
    ...:     cnt_a[xpos[i],ypos[i]] = cnt_a[xpos[i],ypos[i]] + 1
    ...:     

In [98]: m,n = xpos.max()+1, ypos.max()+1
    ...: out = np.zeros((m,n),dtype=int)
    ...: np.add.at(out.ravel(), xpos*n+ypos, 1)
    ...: 

In [99]: cnt_a
Out[99]: 
array([[ 2.,  1.,  2.,  1.],
       [ 0.,  3.,  1.,  2.],
       [ 1.,  1.,  1.,  1.],
       [ 1.,  0.,  0.,  0.]])

In [100]: out
Out[100]: 
array([[2, 1, 2, 1],
       [0, 3, 1, 2],
       [1, 1, 1, 1],
       [1, 0, 0, 0]])
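As a side note, np.add.at also accepts a tuple of index arrays, so the linear-index step can be skipped. The sketch below is not part of the answer above; it contrasts that form with the buffered fancy-index update from the question. The names `buffered` and `unbuffered` are illustrative.

```python
import numpy as np

xpos = np.array([0, 0, 1, 2, 1, 2, 1, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3])
ypos = np.array([3, 2, 1, 1, 3, 0, 1, 0, 0, 1, 2, 1, 2, 3, 3, 2, 0])

# Buffered update: values are read once, incremented, then written back,
# so a repeated (x, y) pair only ever adds 1 to its cell
buffered = np.zeros((4, 4), dtype=int)
buffered[xpos, ypos] += 1

# Unbuffered update: every occurrence of a pair is accumulated
unbuffered = np.zeros((4, 4), dtype=int)
np.add.at(unbuffered, (xpos, ypos), 1)

print(buffered)
print(unbuffered)
```

The second array matches the loop result from the question, while the first reproduces the "Bad" output.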

Answer 3

You can iterate over both lists at once and increment for each pair (in case you are not familiar with it, zip combines the lists):

for x, y in zip(xpos, ypos):
    cnt_b[x][y] += 1

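But this will be roughly as fast as your solution A. If the lists xpos/ypos have length n, I don't see how you could update the matrix in less than O(n), since you have to look at every pair one way or another.

Another solution: you could count identical index pairs (e.g. (0, 3)), perhaps with collections.Counter, and update the matrix with the counts. But I doubt this would be much faster, because the time saved updating the matrix would be lost counting the repeated pairs.

Maybe I'm completely wrong though, in which case I'd be curious to see an answer that is not O(n).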
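The pair-counting idea mentioned above could look roughly like this. It is only a sketch on the question's data; `pair_counts` and `out` are illustrative names, and no claim is made about its speed relative to the plain loop.

```python
from collections import Counter

import numpy as np

xpos = [0, 0, 1, 2, 1, 2, 1, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
ypos = [3, 2, 1, 1, 3, 0, 1, 0, 0, 1, 2, 1, 2, 3, 3, 2, 0]

# Count how many times each (x, y) pair occurs
pair_counts = Counter(zip(xpos, ypos))

# One matrix update per distinct pair instead of one per occurrence
out = np.zeros((4, 4), dtype=int)
for (x, y), count in pair_counts.items():
    out[x, y] += count

print(out)
```

Counting still visits every pair once, so the overall work stays linear in the number of index pairs.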

Answer 4

I think you are looking for ravel_multi_index:

lidx = np.ravel_multi_index((xpos, ypos), cnt_a.shape)

This converts the pairs to "flattened" 1D indices into cnt_a and cnt_b:

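# lidx indexes the flattened array, so apply it to a flat view;
# ravel() returns a view of the contiguous cnt_b, which is updated in place
np.add.at(cnt_b.ravel(), lidx, 1)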
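To make the meaning of the flat indices concrete, here is a small sketch on three hand-picked pairs (the pairs and the name `back` are chosen for illustration). For a row-major array with n columns, the flat index of (x, y) is x * n + y, and np.unravel_index reverses the mapping.

```python
import numpy as np

shape = (4, 4)
xpos = np.array([0, 1, 3])
ypos = np.array([3, 1, 0])

# Flat row-major indices: 0*4+3, 1*4+1, 3*4+0
lidx = np.ravel_multi_index((xpos, ypos), shape)
assert lidx.tolist() == [3, 5, 12]
assert np.array_equal(lidx, xpos * shape[1] + ypos)

# unravel_index maps the flat indices back to (x, y) coordinates
back = np.unravel_index(lidx, shape)
assert np.array_equal(back[0], xpos)
assert np.array_equal(back[1], ypos)
```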

4 answers

Answer 1
3 2017-06-28 12:32:17

Answer 2
2 accepted 2017-06-28 12:28:51

Answer 3
0 2017-06-28 12:18:17

Answer 4
0 2017-06-28 12:21:21
