How do I extract a 2D NumPy sub-array from a 2D NumPy array based on patterns?
I have a 2D NumPy array that looks like this:
Array=
[
[0,0,0,0,0,0,0,2,2,2],
[0,0,0,0,0,0,0,2,2,2],
[0,0,1,1,1,0,0,2,2,2],
[0,0,1,1,1,0,0,2,2,2],
[0,0,1,1,1,0,0,1,1,1],
[0,0,0,0,0,0,0,1,1,1]
]
I need to display the blocks of non-zero elements as separate arrays:
Array1:
[
[1,1,1],
[1,1,1],
[1,1,1]
]
Array2:
[
[2,2,2],
[2,2,2],
[2,2,2],
[2,2,2]
]
Array3:
[
[1,1,1],
[1,1,1]
]
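Can someone help me with the logic I could use to achieve this output? I cannot use fixed indices (such as array[a:b, c:d]), because the logic I create should work for any NumPy array with a similar pattern.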
This uses scipy.ndimage.label to recursively identify disconnected sub-arrays.
import numpy as np
from scipy.ndimage import label

array = np.array(
    [[0,0,0,0,0,0,0,2,2,2,3,3,3],
     [0,0,0,0,0,0,0,2,2,2,0,0,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,0],
     [0,0,1,1,1,0,0,1,1,1,0,0,0],
     [0,0,0,0,0,0,0,1,1,1,0,0,0]])

# initialize list to collect sub-arrays
arr_list = []

def append_subarrays(arr, val, val_0):
    '''
    arr : 2D array
    val : the value used for filtering
    val_0 : the original value, which we want to preserve
    '''
    # remove everything that's not the current val
    arr[arr != val] = 0
    if 0 in arr:  # <-- not a single rectangle yet
        # get relevant indices as well as their minima and maxima
        x_ind, y_ind = np.where(arr != 0)
        min_x, max_x, min_y, max_y = min(x_ind), max(x_ind) + 1, min(y_ind), max(y_ind) + 1
        # cut subarray (everything corresponding to val)
        arr = arr[min_x:max_x, min_y:max_y]
        # use the label function to assign different values to disconnected regions
        labeled_arr = label(arr)[0]
        # recursively apply append_subarrays to each disconnected region
        for sub_val in np.unique(labeled_arr[labeled_arr != 0]):
            append_subarrays(labeled_arr.copy(), sub_val, val_0)
    else:  # <-- we only have a single rectangle left ==> append
        arr_list.append(arr * val_0)

for i in np.unique(array[array > 0]):
    append_subarrays(array.copy(), i, i)

for arr in arr_list:
    print(arr, end='\n'*2)
Output (note: modified example array):
[[1]
 [1]]

[[1 1 1]
 [1 1 1]
 [1 1 1]]

[[1 1 1]
 [1 1 1]]

[[2 2 2]
 [2 2 2]
 [2 2 2]
 [2 2 2]]

[[2]
 [2]]

[[3 3 3]]
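For comparison, the same decomposition can probably be sketched without recursion: label each distinct value on its own, then slice out every labelled region with scipy.ndimage.find_objects. The helper name extract_blocks is made up for this illustration, and the sketch assumes every region is a solid rectangle, as in the examples.

```python
import numpy as np
from scipy.ndimage import label, find_objects

def extract_blocks(arr):
    """Hypothetical helper: one sub-array per connected block of equal values."""
    arr = np.asarray(arr)
    blocks = []
    for v in np.unique(arr[arr != 0]):
        # label connected regions of this one value (4-connectivity by default)
        labeled, _ = label(arr == v)
        # find_objects returns one bounding-box slice tuple per label
        for sl in find_objects(labeled):
            blocks.append(arr[sl])
    return blocks

array = np.array(
    [[0,0,0,0,0,0,0,2,2,2,3,3,3],
     [0,0,0,0,0,0,0,2,2,2,0,0,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,1],
     [0,0,1,1,1,0,0,2,2,2,0,2,0],
     [0,0,1,1,1,0,0,1,1,1,0,0,0],
     [0,0,0,0,0,0,0,1,1,1,0,0,0]])

for block in extract_blocks(array):
    print(block, end='\n'*2)
```

Because each value is labelled separately, two blocks of the same value stay apart even when blocks of other values connect them.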
This sounds like a flood-fill problem, so skimage.measure.label is a good approach:
import numpy as np
from skimage.measure import label

Array = np.array([[0,0,0,0,0,0,0,2,2,2],
                  [0,0,0,0,0,0,0,2,2,2],
                  [0,0,1,1,1,0,0,2,2,2],
                  [0,0,1,1,1,0,0,2,2,2],
                  [0,0,1,1,1,0,0,1,1,1],
                  [0,0,0,0,0,0,0,1,1,1]
                  ])

labels = label(Array, connectivity=1)
# the loop variable is not called `label`, so it does not shadow the imported function
for region in range(1, labels.max() + 1):
    xs, ys = np.where(labels == region)
    # the reshape assumes the region is a solid rectangle
    shape = (len(np.unique(xs)), len(np.unique(ys)))
    print(Array[xs, ys].reshape(shape))
Output:
[[2 2 2]
 [2 2 2]
 [2 2 2]
 [2 2 2]]
[[1 1 1]
 [1 1 1]
 [1 1 1]]
[[1 1 1]
 [1 1 1]]
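To make the flood-fill idea concrete, here is a minimal pure-Python sketch of a breadth-first flood fill over 4-connected cells of equal value. It is not how skimage implements label; the function name flood_fill_regions and the choice to return each region's bounding box as nested lists are assumptions made for this illustration.

```python
from collections import deque

def flood_fill_regions(grid):
    """Hypothetical helper: bounding box of every 4-connected region of equal non-zero cells."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if grid[r][c] == 0 or seen[r][c]:
                continue
            value = grid[r][c]
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r  # bounding box rows (inclusive)
            c0 = c1 = c  # bounding box columns (inclusive)
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                # visit the four direct neighbours that hold the same value
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append([row[c0:c1 + 1] for row in grid[r0:r1 + 1]])
    return regions

grid = [[0,0,0,0,0,0,0,2,2,2],
        [0,0,0,0,0,0,0,2,2,2],
        [0,0,1,1,1,0,0,2,2,2],
        [0,0,1,1,1,0,0,2,2,2],
        [0,0,1,1,1,0,0,1,1,1],
        [0,0,0,0,0,0,0,1,1,1]]

for region in flood_fill_regions(grid):
    print(region)
```

For non-rectangular regions the bounding box would also contain cells that do not belong to the region, so this sketch only suits inputs like the ones in the question.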
import numpy as np

# the example array from the question; note that the loop below overwrites it with zeros
array = np.array([[0,0,0,0,0,0,0,2,2,2],
                  [0,0,0,0,0,0,0,2,2,2],
                  [0,0,1,1,1,0,0,2,2,2],
                  [0,0,1,1,1,0,0,2,2,2],
                  [0,0,1,1,1,0,0,1,1,1],
                  [0,0,0,0,0,0,0,1,1,1]])

startRowIndex = 0  # indexes of sub-arrays
endRowIndex = 0
startColumnIndex = 0
endColumnIndex = 0
tmpI = 0  # for iterating inside the i,j loops
tmpJ = 0
value = 0  # which number we are looking for in array
for i in range(array.shape[0]):  # array.shape[0] is the number of rows, shape[1] the number of columns
    for j in range(array[i].size):  # for all elements in a row
        if array[i, j] != 0:  # if the element is different from 0
            startRowIndex = i
            startColumnIndex = j
            tmpI = i
            tmpJ = j  # you cannot change the looping indexes, so create tmp indexes
            value = array[i, j]  # save which number the sub-array consists of (for example 2)
            while array[tmpI, tmpJ] != 0 and array[tmpI, tmpJ] == value:  # iterate over column numbers
                tmpJ += 1
                if tmpJ == array.shape[1]:  # reached the end of the row
                    break
            # tmpJ is now the first column that is not part of the block;
            # a slice a[start:stop] takes the values from [start, stop) (stop is excluded)
            endColumnIndex = tmpJ
            tmpI = i
            tmpJ = j
            while array[tmpI, tmpJ] != 0 and array[tmpI, tmpJ] == value:  # iterate over row numbers
                tmpI += 1
                if tmpI == array.shape[0]:  # reached the end of the column
                    break
            # tmpI is now the first row that is not part of the block
            endRowIndex = tmpI
            print(array[startRowIndex:endRowIndex, startColumnIndex:endColumnIndex])
            # set the already used elements to zero
            array[startRowIndex:endRowIndex, startColumnIndex:endColumnIndex] = 0
This is brute force, but it works the way you want. This approach does not use any external library other than NumPy.
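The brute-force scan above overwrites the input array as it goes. As a hedged variation, the same scan could be wrapped in a function that works on a copy and returns the blocks instead of printing them. The name scan_blocks is hypothetical, and the sketch still assumes rectangular blocks.

```python
import numpy as np

def scan_blocks(array):
    """Hypothetical helper: brute-force scan that leaves the caller's array untouched."""
    work = np.array(array, copy=True)  # zero out a private copy, not the input
    rows, cols = work.shape
    blocks = []
    for i in range(rows):
        for j in range(cols):
            value = work[i, j]
            if value == 0:
                continue
            # walk right along the row until the value changes
            end_col = j
            while end_col < cols and work[i, end_col] == value:
                end_col += 1
            # walk down the column until the value changes
            end_row = i
            while end_row < rows and work[end_row, j] == value:
                end_row += 1
            blocks.append(work[i:end_row, j:end_col].copy())
            work[i:end_row, j:end_col] = 0  # mark the block as consumed
    return blocks

array = np.array([[0,0,0,0,0,0,0,2,2,2],
                  [0,0,0,0,0,0,0,2,2,2],
                  [0,0,1,1,1,0,0,2,2,2],
                  [0,0,1,1,1,0,0,2,2,2],
                  [0,0,1,1,1,0,0,1,1,1],
                  [0,0,0,0,0,0,0,1,1,1]])

for block in scan_blocks(array):
    print(block)
```

Checking the bounds inside the while conditions also removes the need for the separate break statements.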
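Here is my pure Python (no NumPy) solution. I take advantage of the fact that contiguous regions are always rectangular.
The algorithm scans from top-left to bottom-right; when it finds the corner of a region, it scans to find the top-right and bottom-left corners. The dictionary skip is filled so that later scans can horizontally skip over any rectangle that has already been found.
For a grid with n rows and m columns, the time complexity is O(nm), which is optimal for this problem.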
def find_rectangles(grid):
    width, height = len(grid[0]), len(grid)
    skip = dict()
    for y in range(height):
        x = 0
        while x < width:
            if (x, y) in skip:
                x = skip[x, y]
            elif not grid[y][x]:
                x += 1
            else:
                v = grid[y][x]
                x2 = x + 1
                while x2 < width and grid[y][x2] == v:
                    x2 += 1
                y2 = y + 1
                while y2 < height and grid[y2][x] == v:
                    skip[x, y2] = x2
                    y2 += 1
                yield [ row[x:x2] for row in grid[y:y2] ]
                x = x2
Example:
>>> for r in find_rectangles(grid1): # example from the question
...     print(r)
...
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]]
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
[[1, 1, 1], [1, 1, 1]]
>>> for r in find_rectangles(grid2): # example from mcsoini's answer
...     print(r)
...
[[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]]
[[3, 3, 3]]
[[1], [1]]
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
[[2], [2]]
[[1, 1, 1], [1, 1, 1]]
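Several of the approaches here rely on every region being a solid rectangle. If that is in doubt for a given input, a check along the following lines might help. It is only a sketch: the function name regions_are_rectangles is hypothetical, and it uses scipy.ndimage rather than pure Python.

```python
import numpy as np
from scipy.ndimage import label, find_objects

def regions_are_rectangles(arr):
    """Hypothetical check: True if every connected region of equal non-zero values fills its bounding box."""
    arr = np.asarray(arr)
    for v in np.unique(arr[arr != 0]):
        labeled, _ = label(arr == v)
        # labels are consecutive (1, 2, ...), so enumerate pairs each slice with its label
        for k, sl in enumerate(find_objects(labeled), start=1):
            if not (labeled[sl] == k).all():
                return False
    return True

question_grid = [[0,0,0,0,0,0,0,2,2,2],
                 [0,0,0,0,0,0,0,2,2,2],
                 [0,0,1,1,1,0,0,2,2,2],
                 [0,0,1,1,1,0,0,2,2,2],
                 [0,0,1,1,1,0,0,1,1,1],
                 [0,0,0,0,0,0,0,1,1,1]]
l_shape = [[1, 1],
           [1, 0]]

print(regions_are_rectangles(question_grid))
print(regions_are_rectangles(l_shape))
```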
We can use scipy.ndimage.label and scipy.ndimage.find_objects to do this:
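import numpy as np
from scipy.ndimage import label, find_objects

Array = np.array(Array)
# outer find_objects: bounding box of each connected non-zero component
# inner find_objects: bounding box of each value inside that component
[Array[j][i] for j in find_objects(*label(Array)) for i in find_objects(Array[j])]
# [array([[1, 1, 1],
#         [1, 1, 1]]), array([[2, 2, 2],
#         [2, 2, 2],
#         [2, 2, 2],
#         [2, 2, 2]]), array([[1, 1, 1],
#         [1, 1, 1],
#         [1, 1, 1]])]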
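One caveat with the one-liner: find_objects returns None for any label value that does not occur. Since the inner call treats the raw array values as labels, a component that contains only 2s yields [None, slices], and indexing with None adds an axis instead of raising an error. A hedged sketch that skips those entries follows; the name blocks_by_value is made up for this example.

```python
import numpy as np
from scipy.ndimage import label, find_objects

def blocks_by_value(arr):
    """Hypothetical helper: the nested find_objects approach, skipping absent values."""
    arr = np.asarray(arr)
    labeled, _ = label(arr)  # connected components of all non-zero cells
    blocks = []
    for outer in find_objects(labeled):
        region = arr[outer]
        # here the values themselves act as labels; absent values give None
        for inner in find_objects(region):
            if inner is None:
                continue
            blocks.append(region[inner])
    return blocks

example = np.array([[0, 2, 2],
                    [0, 2, 2],
                    [1, 0, 0]])

for block in blocks_by_value(example):
    print(block)
```

Like the original one-liner, this still assumes that each value forms a single block within a connected component.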