How can I extract 2D NumPy sub-arrays from a 2D NumPy array based on a pattern?

Question

I have a 2D NumPy array that looks like this:


Array=
[
[0,0,0,0,0,0,0,2,2,2],
[0,0,0,0,0,0,0,2,2,2],
[0,0,1,1,1,0,0,2,2,2],
[0,0,1,1,1,0,0,2,2,2],
[0,0,1,1,1,0,0,1,1,1],
[0,0,0,0,0,0,0,1,1,1]
]

I need to extract the blocks of non-zero elements as separate arrays:

Array1:
[
[1,1,1],
[1,1,1],
[1,1,1]
]

Array2:
[
[2,2,2],
[2,2,2],
[2,2,2],
[2,2,2]
]

Array3:
[
[1,1,1],
[1,1,1]
]

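Can someone help me with the logic I could use to produce the output above? I cannot use fixed indices (such as array[a:b, c:d]), because the logic should work for any NumPy array with a similar pattern.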

Answer 1

This uses scipy.ndimage.label to recursively identify disconnected sub-arrays.

import numpy as np
from scipy.ndimage import label

array = np.array(
    [[0,0,0,0,0,0,0,2,2,2,3,3,3],
     [0,0,0,0,0,0,0,2,2,2,0,0,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,0],
     [0,0,1,1,1,0,0,1,1,1,0,0,0],
     [0,0,0,0,0,0,0,1,1,1,0,0,0]])

# initialize list to collect sub-arrays
arr_list = []

def append_subarrays(arr, val, val_0):
    '''
    arr : 2D array
    val : the value used for filtering
    val_0 : the original value, which we want to preserve
    '''

    # remove everything that's not the current val
    arr[arr != val] = 0
    if 0 in arr:  # <-- not a single rectangle yet
        # get relevant indices as well as their minima and maxima
        x_ind, y_ind = np.where(arr != 0)
        min_x, max_x, min_y, max_y = min(x_ind), max(x_ind) + 1, min(y_ind), max(y_ind) + 1
        # cut subarray (everything corresponding to val)
        arr = arr[min_x:max_x, min_y:max_y]
        # use the label function to assign different values to disconnected regions
        labeled_arr = label(arr)[0]
        # recursively apply append_subarrays to each disconnected region 
        for sub_val in np.unique(labeled_arr[labeled_arr != 0]):
            append_subarrays(labeled_arr.copy(), sub_val, val_0)

    else:  # <-- we only have a single rectangle left ==> append
        arr_list.append(arr * val_0)

for i in np.unique(array[array > 0]):
    append_subarrays(array.copy(), i, i)

for arr in arr_list:
    print(arr, end='\n'*2)

Output (note: the example array was modified):

[[1]
 [1]]

[[1 1 1]
 [1 1 1]
 [1 1 1]]

[[1 1 1]
 [1 1 1]]

[[2 2 2]
 [2 2 2]
 [2 2 2]
 [2 2 2]]

[[2]
 [2]]

[[3 3 3]]
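
For comparison, the recursion can probably be avoided by labelling one value at a time and reading the bounding boxes with scipy.ndimage.find_objects. This is only a sketch of an alternative, not the method used in this answer: the helper name blocks_per_value is hypothetical, and it assumes, as in the question, that every connected block of equal values is a filled rectangle.

```python
import numpy as np
from scipy.ndimage import label, find_objects

def blocks_per_value(array):
    """Return one sub-array per connected block of equal non-zero values.

    Hypothetical helper; assumes every such block is a filled rectangle.
    """
    array = np.asarray(array)
    blocks = []
    for value in np.unique(array[array != 0]):
        # label only the cells equal to this value
        # (the default structure connects horizontal/vertical neighbours)
        labeled, _ = label(array == value)
        # find_objects returns one tuple of slices (a bounding box) per label
        for box in find_objects(labeled):
            blocks.append(array[box])
    return blocks

example = np.array(
    [[0,0,0,0,0,0,0,2,2,2,3,3,3],
     [0,0,0,0,0,0,0,2,2,2,0,0,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,0],
     [0,0,1,1,1,0,0,1,1,1,0,0,0],
     [0,0,0,0,0,0,0,1,1,1,0,0,0]])

for block in blocks_per_value(example):
    print(block, end='\n'*2)
```

Because the returned blocks are slices of the input, they are views; call .copy() on them if the original array will be modified later.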

Answer 2

This sounds like a flood-fill problem, so skimage.measure.label is a good approach:

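import numpy as np
from skimage.measure import label

Array=np.array([[0,0,0,0,0,0,0,2,2,2],
                [0,0,0,0,0,0,0,2,2,2],
                [0,0,1,1,1,0,0,2,2,2],
                [0,0,1,1,1,0,0,2,2,2],
                [0,0,1,1,1,0,0,1,1,1],
                [0,0,0,0,0,0,0,1,1,1]
                ])

# connectivity=1 joins only horizontal/vertical neighbours that share a value
labels = label(Array, connectivity=1)

for lbl in range(1, labels.max()+1):  # lbl, so the label function is not shadowed
    xs, ys = np.where(labels==lbl)
    # the reshape assumes each labelled region is a filled rectangle
    shape = (len(np.unique(xs)), len(np.unique(ys)))

    print(Array[xs, ys].reshape(shape))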

Output:

[[2 2 2]
 [2 2 2]
 [2 2 2]
 [2 2 2]]
[[1 1 1]
 [1 1 1]
 [1 1 1]]
[[1 1 1]
 [1 1 1]]
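
To make the flood-fill idea concrete, here is a rough sketch of what such a labelling step does, written with a plain breadth-first search instead of skimage. The function name flood_fill_regions is hypothetical, and this illustrates the technique rather than how skimage implements it. It groups 4-connected cells that share the same non-zero value and yields the bounding box of each group.

```python
from collections import deque

import numpy as np

def flood_fill_regions(grid):
    """Yield the bounding-box sub-array of every 4-connected region of
    equal non-zero values (hypothetical helper, illustration only)."""
    grid = np.asarray(grid)
    n_rows, n_cols = grid.shape
    seen = np.zeros(grid.shape, dtype=bool)
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if grid[r0, c0] == 0 or seen[r0, c0]:
                continue
            value = grid[r0, c0]
            queue = deque([(r0, c0)])
            seen[r0, c0] = True
            r_min = r_max = r0
            c_min = c_max = c0
            while queue:
                r, c = queue.popleft()
                # grow the bounding box to include the current cell
                r_min, r_max = min(r_min, r), max(r_max, r)
                c_min, c_max = min(c_min, c), max(c_max, c)
                # visit the four direct neighbours that hold the same value
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and not seen[nr, nc] and grid[nr, nc] == value):
                        seen[nr, nc] = True
                        queue.append((nr, nc))
            yield grid[r_min:r_max + 1, c_min:c_max + 1]

question_array = np.array([[0,0,0,0,0,0,0,2,2,2],
                           [0,0,0,0,0,0,0,2,2,2],
                           [0,0,1,1,1,0,0,2,2,2],
                           [0,0,1,1,1,0,0,2,2,2],
                           [0,0,1,1,1,0,0,1,1,1],
                           [0,0,0,0,0,0,0,1,1,1]])

for region in flood_fill_regions(question_array):
    print(region)
```

If a region is not a filled rectangle, the bounding box will also contain cells that belong to other regions, so this sketch relies on the same rectangular pattern as the question.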

Answer 3

# `array` is assumed to be the 2D NumPy array from the question; it is modified in place
startRowIndex = 0 #indexes of sub-arrays
endRowIndex = 0
startColumnIndex = 0
endColumnIndex = 0

tmpI = 0 #for iterating inside the i,j loops
tmpJ = 0
value = 0 #which number we are looking for in array
for i in range(array.shape[0]): #array.shape[0] says how many rows, shape[1] says how many columns
    for j in range(array[i].size): #for all elements in a row
        if(array[i,j] != 0): #if the element is different than 0
            startRowIndex = i
            startColumnIndex = j
            tmpI = i
            tmpJ = j #you cannot change the looping indexes so create tmp indexes
            value = array[i,j] #save what number will be sub-array (for example 2)
            while(array[tmpI,tmpJ] != 0 and array[tmpI,tmpJ] == value ): #iterate over column numbers
                tmpJ+=1
                if tmpJ == array.shape[1]: #if you reached end of the array (that is end of the column)
                    break

            #the loop stops either past the last column or on the first element
            #that is zero or holds a different value; since slicing a[start:stop]
            #takes the values from <start; stop) (stop is excluded), tmpJ is the end index
            endColumnIndex = tmpJ 
            tmpI = i
            tmpJ = j

            while(array[tmpI,tmpJ] != 0 and array[tmpI,tmpJ] == value): #iterate over row numbers
                tmpI += 1
                if tmpI == array.shape[0]: #if you reached end of the array
                    break
            #same as above: tmpI is now the first row index outside the sub-array
            endRowIndex = tmpI
            print(array[startRowIndex:endRowIndex, startColumnIndex:endColumnIndex])
            #change array to zero with already used elements
            array[startRowIndex:endRowIndex, startColumnIndex:endColumnIndex] = 0

This is a brute-force approach, but it works the way you want. It does not use any external library other than NumPy.

Answer 4

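Here is my pure-Python (no NumPy) solution. I take advantage of the fact that contiguous regions are always rectangular.

The algorithm scans from the top-left to the bottom-right; when it finds the corner of a region, it scans to find the top-right and bottom-left corners. The dictionary skip is filled in so that later scans can jump horizontally over any rectangle that has already been found.

For a grid with n rows and m columns, the time complexity is O(nm), which is optimal for this problem.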

def find_rectangles(grid):
    width, height = len(grid[0]), len(grid)

    skip = dict()

    for y in range(height):
        x = 0
        while x < width:
            if (x, y) in skip:
                x = skip[x, y]
            elif not grid[y][x]:
                x += 1
            else:
                v = grid[y][x]

                x2 = x + 1
                while x2 < width and grid[y][x2] == v:
                    x2 += 1

                y2 = y + 1
                while y2 < height and grid[y2][x] == v:
                    skip[x, y2] = x2
                    y2 += 1

                yield [ row[x:x2] for row in grid[y:y2] ]

                x = x2

Example:

>>> for r in find_rectangles(grid1): # example from the question
...     print(r)
...
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]]
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
[[1, 1, 1], [1, 1, 1]]
>>> for r in find_rectangles(grid2): # example from mcsoini's answer
...     print(r)
...
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]]
[[3, 3, 3]]
[[1], [1]]
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
[[2], [2]]
[[1, 1, 1], [1, 1, 1]]

Answer 5

We can do this with scipy.ndimage.label and scipy.ndimage.find_objects:

import numpy as np
from scipy.ndimage import label, find_objects
Array = np.array(Array)
[Array[j][i] for j in find_objects(*label(Array)) for i in find_objects(Array[j])]
# [array([[1, 1, 1],
#        [1, 1, 1]]), array([[2, 2, 2],
#        [2, 2, 2],
#        [2, 2, 2],
#        [2, 2, 2]]), array([[1, 1, 1],
#        [1, 1, 1],
#        [1, 1, 1]])]
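
The one-liner is dense, so here is a sketch of the same two-step idea written as explicit loops. The function name extract_blocks is hypothetical. The first find_objects call works on the labelled array and gives one bounding box per connected non-zero region, whatever values it contains. The second call is applied to the values themselves, so it treats each distinct value as a label; it returns None for values that do not occur in the region, which this sketch skips.

```python
import numpy as np
from scipy.ndimage import label, find_objects

def extract_blocks(array):
    """Split an array into per-value blocks (hypothetical helper).

    Assumes each value occurs at most once, as a filled rectangle,
    inside any connected non-zero region.
    """
    array = np.asarray(array)
    # step 1: connected non-zero regions, regardless of the values inside
    labeled, n_regions = label(array)
    blocks = []
    for region in find_objects(labeled, max_label=n_regions):
        sub = array[region]
        # step 2: treat the values in the region as labels;
        # entries are None for values missing from this region
        for value_box in find_objects(sub):
            if value_box is not None:
                blocks.append(sub[value_box])
    return blocks

question_array = np.array([[0,0,0,0,0,0,0,2,2,2],
                           [0,0,0,0,0,0,0,2,2,2],
                           [0,0,1,1,1,0,0,2,2,2],
                           [0,0,1,1,1,0,0,2,2,2],
                           [0,0,1,1,1,0,0,1,1,1],
                           [0,0,0,0,0,0,0,1,1,1]])

for block in extract_blocks(question_array):
    print(block)
```

One limitation to keep in mind: if the same value appears in two separate blocks inside one connected region, the second step returns a single bounding box spanning both.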

Answer 1 (2; 2019-11-27 19:12:57)
Answer 2 (2; 2019-11-27 19:20:40)
Answer 3 (2; 2019-11-27 19:21:39)
Answer 4 (2; 2019-11-27 21:48:12)
Answer 5 (1; 2019-11-27 19:50:15)