Efficiently finding the indices of a set of contiguous values in a 2D numpy array

Question

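I have computed a segmentation of an image in which each superpixel (region) is defined by the value of the entries in a 2D array of the same size as the image. I am trying to get a list of indices for each region so that I can perform per-region operations later. This is my current code: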

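import numpy as np

# For each superpixel label, scan the whole array for matching entries
index_list = []
for i in range(num_superpixels):
    indices = np.where(superpixels == i)  # tuple of (row indices, column indices)
    index_list.append(indices)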

Below is a minimal example of a 3x3 input containing 3 regions. In practice, I work with 500-1000 superpixels obtained from 640x480 images, and things get very slow.

>>> superpixels
array([[0, 0, 2],
       [0, 0, 2],
       [1, 1, 2]])

>>> index_list
[(array([0, 0, 1, 1]), array([0, 1, 0, 1])),
 (array([2, 2]), array([0, 1])),
 (array([0, 1, 2]), array([2, 2, 2]))]

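Since each region is a contiguous chunk (in the 2D image, but not in memory), using np.where in a loop is really inefficient: at each iteration it goes over all width*height entries to find a region of about 500 entries.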

How can I speed this up?

Answer 1

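First of all, a better algorithm can be designed based on direct indexing of the regions. Indeed, the current code has a complexity of O(width * height * num_superpixels), while O(width * height) is achievable. The idea is to create num_superpixels bins and append the location of each cell (of the 2D array) to bin[cellValue].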
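The binning idea can be illustrated with a plain-Python sketch. This is only meant to show the single-pass logic on the 3x3 example from the question; the function name bin_positions is hypothetical and not part of the original answer.

```python
import numpy as np

def bin_positions(superpixels, num_superpixels):
    """Visit every cell once and append its (row, col) to the bin of its label."""
    # One pair of lists (rows, cols) per superpixel label
    bins = [([], []) for _ in range(num_superpixels)]
    height, width = superpixels.shape
    for i in range(height):
        for j in range(width):
            rows, cols = bins[superpixels[i, j]]
            rows.append(i)
            cols.append(j)
    return bins

superpixels = np.array([[0, 0, 2],
                        [0, 0, 2],
                        [1, 1, 2]])
bins = bin_positions(superpixels, 3)
```

Each cell is touched exactly once, so the work is proportional to width * height regardless of the number of regions.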

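Note that implementing this with Python loops would be too slow, but you can speed up the implementation using Numba. Since Numba does not like variable-sized arrays (they are inefficient), a first pass can be applied to count the number of cells in each bin, and a second pass then fills in the cell locations.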

Here is an example:

import numpy as np
from numba import jit, njit, int32, int64, prange
from numba.types import UniTuple, List

@jit(List(UniTuple(int32[::1],2))(int64[:,::1], int64))
def fastCompute(superpixels, num_superpixels):
    # Count the number of elements
    binSize = np.zeros(num_superpixels, dtype=np.int32)
    for i in range(superpixels.shape[0]):
        for j in range(superpixels.shape[1]):
            binSize[superpixels[i,j]] += 1

    # Put the pixels location in the right bin
    result = [(np.empty(binSize[i], dtype=np.int32), np.empty(binSize[i], dtype=np.int32)) for i in range(num_superpixels)]
    binPos = np.zeros(num_superpixels, dtype=np.int32)
    for i in range(superpixels.shape[0]):
        for j in range(superpixels.shape[1]):
            binIdx = superpixels[i,j]
            tmp = result[binIdx]
            cellBinPos = binPos[binIdx]
            tmp[0][cellBinPos] = i
            tmp[1][cellBinPos] = j
            binPos[binIdx] += 1

    return result

On my machine, with the following random-based configuration, the function above is 120 times faster than the initial code.

# Generate a random input (int64 to match the signature of fastCompute)
num_superpixels = 500
superpixels = np.random.randint(0, num_superpixels, size=(640, 480), dtype=np.int64)

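The output type of fastCompute is similar to that of the initial code (except that it uses tuples and 32-bit integers for the sake of performance), but it is not optimal because it contains pure Python object types and is not very compact. Tweaking the output type should result in even faster code.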
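One possible compact output, sketched here as an assumption rather than as the layout the answer has in mind, stores all row and column indices in two flat arrays plus an offsets array that marks where each region starts. The sketch below uses NumPy only; it relies on a stable sort, so it costs O(n log n) rather than O(n), and the function name compact_region_indices is hypothetical.

```python
import numpy as np

def compact_region_indices(superpixels, num_superpixels):
    """Return (rows, cols, offsets): region k occupies offsets[k]:offsets[k+1]."""
    flat = superpixels.ravel()
    # A stable sort groups cells by label while keeping row-major order within a label
    order = np.argsort(flat, kind="stable")
    counts = np.bincount(flat, minlength=num_superpixels)
    offsets = np.concatenate(([0], np.cumsum(counts)))
    rows, cols = np.unravel_index(order, superpixels.shape)
    return rows, cols, offsets

superpixels = np.array([[0, 0, 2],
                        [0, 0, 2],
                        [1, 1, 2]])
rows, cols, offsets = compact_region_indices(superpixels, 3)

# Indices of region 2, as slices of the flat arrays
k = 2
region_rows = rows[offsets[k]:offsets[k + 1]]
region_cols = cols[offsets[k]:offsets[k + 1]]
```

Slicing returns views into the flat arrays, so no per-region Python objects are created until a region is actually requested.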

Solution 1
2 Accepted 2021-04-04 01:21:16