How to find lines in a 2D numpy array?

Question

I have a 2D numpy array and I want to find the boundary points of both horizontal and vertical lines.

import numpy as np

gray_img = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0]])

desired_outcome = [ [[1,1],[10,1]],
                    [[1,2],[10,2]],
                    [[1,3],[2,3]], ...]

Here are the lines I want to find:

Later, I want to remove the shorter lines and keep only those whose end points are more than 2 points apart.

Answer 1

Vertical lines:

m = np.diff(gray_img, 1, 0) # discrete difference along axis 0 (between consecutive rows)
m = np.argwhere(m != 0)     # (row, column) indices where the value changes
m = m[np.lexsort(m.T)]      # sort by column first, then by row
m[::2,0] += 1               # a change at diff row i means the line starts at row i + 1

Output:

[[ 1  1]
 [10  1]
 [ 1  2]
 [10  2]
 [ 1  3]
 [ 2  3]
 [ 1  4]
 [ 2  4]
 [ 1  5]
 [ 2  5]
 [ 6  5]
 [ 7  5]
 [ 1  6]
 [ 2  6]
 [ 6  6]
 [ 7  6]
 [ 1  7]
 [ 2  7]
 [ 6  7]
 [ 7  7]
 [ 6  8]
 [ 7  8]]
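
The index list above alternates start point, end point, start point, end point, so grouping it two rows at a time gives the nested format asked for in the question. A minimal sketch, assuming (as in the sample) that no line touches the top or bottom row of the array; the name `vertical` is only illustrative, and `gray_img` is rebuilt here with slice assignments that produce the same values as the question's array:

```python
import numpy as np

# Same values as the question's gray_img, rebuilt with slice assignments.
gray_img = np.zeros((15, 9), dtype=int)
gray_img[1:3, 1:8] = 255
gray_img[3:11, 1:3] = 255
gray_img[6:8, 5:9] = 255

m = np.diff(gray_img, 1, 0)
m = np.argwhere(m != 0)
m = m[np.lexsort(m.T)]
m[::2, 0] += 1

# Rows alternate start, end, start, end, ... so group them in pairs:
# the result has shape (number_of_lines, 2, 2).
vertical = m.reshape(-1, 2, 2)
print(vertical[:3].tolist())
# [[[1, 1], [10, 1]], [[1, 2], [10, 2]], [[1, 3], [2, 3]]]
```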

Horizontal lines:

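# Append a zero column so that lines touching the right edge are closed.
m = np.diff(gray_img, 1, 1, append=np.zeros((gray_img.shape[0], 1)))
m = np.argwhere(m != 0)     # already sorted by row, then by column
m[::2,1] += 1               # a change at diff column j means the line starts at column j + 1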

Output:

[[ 1  1]
 [ 1  7]
 [ 2  1]
 [ 2  7]
 [ 3  1]
 [ 3  2]
 [ 4  1]
 [ 4  2]
 [ 5  1]
 [ 5  2]
 [ 6  1]
 [ 6  2]
 [ 6  5]
 [ 6  8]
 [ 7  1]
 [ 7  2]
 [ 7  5]
 [ 7  8]
 [ 8  1]
 [ 8  2]
 [ 9  1]
 [ 9  2]
 [10  1]
 [10  2]]
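
For the follow-up step in the question (dropping the short lines), the paired points can be filtered with a boolean mask. This is only a sketch: it assumes "more than 2 points distance" means that the end column minus the start column is greater than 2, and the names `horizontal`, `dist` and `long_lines` are illustrative, not part of the answer.

```python
import numpy as np

# Same values as the question's gray_img, rebuilt with slice assignments.
gray_img = np.zeros((15, 9), dtype=int)
gray_img[1:3, 1:8] = 255
gray_img[3:11, 1:3] = 255
gray_img[6:8, 5:9] = 255

m = np.diff(gray_img, 1, 1, append=np.zeros((gray_img.shape[0], 1)))
m = np.argwhere(m != 0)
m[::2, 1] += 1

# Pair up start and end points: shape (number_of_lines, 2, 2).
horizontal = m.reshape(-1, 2, 2)

# Assumed definition of distance: end column minus start column.
dist = horizontal[:, 1, 1] - horizontal[:, 0, 1]
long_lines = horizontal[dist > 2]
print(long_lines.tolist())
# [[[1, 1], [1, 7]], [[2, 1], [2, 7]], [[6, 5], [6, 8]], [[7, 5], [7, 8]]]
```

For the vertical result the same mask works with the row coordinate (index 0) in place of the column coordinate.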

Answer 2

Here's an algorithm for horizontal lines based on prefix sums:

# Do a prefix sum to get the lengths of each line.
gray_cumsum = np.cumsum(gray_img / 255, axis=1)
gray_cumsum[:, 1:] = gray_cumsum[:, 1:] * (gray_cumsum[:, 1:] != gray_cumsum[:, :-1])
# Reindex all the points so each line starts at 1.
start_num = gray_cumsum.copy()
a = start_num[:,1:-1] != 0
b = start_num[:,:-2] == 0
c = start_num[:,2:] != 0
start_num[:,1:-1] = start_num[:,1:-1] * np.logical_and(np.logical_and(a, b), c)
start_num[:, -1] = 0
start_num = np.maximum.accumulate(start_num, axis=1)
gray_cumsum = np.maximum(gray_cumsum - start_num, 0)
# Detect only the ends of each line.
gray_cumsum[:,:-1] = gray_cumsum[:,:-1] * (gray_cumsum[:,1:] == 0)
# Get the starting and endings points of each line.
end_points = np.stack(gray_cumsum.nonzero(), axis=-1)
lengths = gray_cumsum[gray_cumsum.nonzero()]
start_points = end_points.copy()
start_points[:, 1] = start_points[:, 1] - lengths
print(start_points)
print(end_points)

Just change the indexing to get vertical lines. You can use the lengths array to select which lines you want.
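
One way to reuse the horizontal code for vertical lines without rewriting the indexing is to run it on the transposed array and swap the coordinates back. The sketch below wraps the steps above in a function; the names `horizontal_runs` and `vertical_runs` are hypothetical, and the explicit integer cast of `lengths` is an addition. It assumes every line is at least two pixels long, as in the sample; tracing a row with an isolated 255 pixel by hand suggests such pixels are not reported correctly.

```python
import numpy as np

def horizontal_runs(img):
    """Prefix-sum steps from the answer, wrapped in a function."""
    gray_cumsum = np.cumsum(img / 255, axis=1)
    gray_cumsum[:, 1:] = gray_cumsum[:, 1:] * (gray_cumsum[:, 1:] != gray_cumsum[:, :-1])
    start_num = gray_cumsum.copy()
    a = start_num[:, 1:-1] != 0
    b = start_num[:, :-2] == 0
    c = start_num[:, 2:] != 0
    start_num[:, 1:-1] = start_num[:, 1:-1] * (a & b & c)
    start_num[:, -1] = 0
    start_num = np.maximum.accumulate(start_num, axis=1)
    gray_cumsum = np.maximum(gray_cumsum - start_num, 0)
    gray_cumsum[:, :-1] = gray_cumsum[:, :-1] * (gray_cumsum[:, 1:] == 0)
    end_points = np.stack(gray_cumsum.nonzero(), axis=-1)
    # lengths holds end index minus start index for each line.
    lengths = gray_cumsum[gray_cumsum.nonzero()].astype(int)
    start_points = end_points.copy()
    start_points[:, 1] -= lengths
    return start_points, end_points, lengths

def vertical_runs(img):
    """Scan the transposed image, then swap (col, row) back to (row, col)."""
    starts, ends, lengths = horizontal_runs(img.T)
    return starts[:, ::-1], ends[:, ::-1], lengths

# Same values as the question's gray_img, rebuilt with slice assignments.
gray_img = np.zeros((15, 9), dtype=int)
gray_img[1:3, 1:8] = 255
gray_img[3:11, 1:3] = 255
gray_img[6:8, 5:9] = 255

v_starts, v_ends, v_lengths = vertical_runs(gray_img)
keep = v_lengths > 2  # assumed threshold from the question
print(v_starts[keep].tolist(), v_ends[keep].tolist())
# [[1, 1], [1, 2]] [[10, 1], [10, 2]]
```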

Answer 3

EDIT: This now collects both horizontal and vertical lines. I also removed some superfluous comparisons from the first draft: the scan runs in one direction, so only the right/bottom end coordinate has to be updated, while the left/top coordinate stays constant, equal to where the scan of the current line segment began. There is still some superfluous code that can be compressed.

EDIT2: Added the final formatting of the list, as in the question.

It finds and lists the horizontal lines first. If the vertical lines have to come first, as in your example output, just put the second traversal on top.

    import numpy as np

    gray_img = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 255, 255, 255, 255, 255, 0],
                         [0, 255, 255, 255, 255, 255, 255, 255, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 255, 255, 255, 255],
                         [0, 255, 255, 0, 0, 255, 255, 255, 255],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0]])
                
    bounds = []  # each entry is [top_y, left_x, bottom_y, right_x]
    a = gray_img

    # SCANNING HORIZONTAL LINES: walk each row from left to right
    for y in range(a.shape[0]):
        found = False  # True while the scan is inside a line
        left_x = right_x = 0
        for x in range(a.shape[1]):
            if a[y, x] == 255:
                if not found:  # first pixel of a new line
                    found = True
                    left_x = x
                right_x = x  # only the right end moves during the scan
            elif found:  # a 0 ends the running line
                bounds.append([y, left_x, y, right_x])
                found = False
        if found:  # the running line reaches the right edge of the array
            bounds.append([y, left_x, y, right_x])

    # SCANNING VERTICAL LINES: walk each column from top to bottom
    for x in range(a.shape[1]):
        found = False
        top_y = bottom_y = 0
        for y in range(a.shape[0]):
            if a[y, x] == 255:
                if not found:  # first pixel of a new line
                    found = True
                    top_y = y
                bottom_y = y  # only the bottom end moves during the scan
            elif found:  # a 0 ends the running line
                bounds.append([top_y, x, bottom_y, x])
                found = False
        if found:  # the running line reaches the bottom edge of the array
            bounds.append([top_y, x, bottom_y, x])

    print(bounds)
    print(a.shape)

# [[1, 1, 1, 7], [2, 1, 2, 7], [3, 1, 3, 2], [4, 1, 4, 2], [5, 1, 5, 2], [6, 1, 6, 2], [6, 5, 6, 8], [7, 1, 7, 2], [7, 5, 7, 8], [8, 1, 8, 2], [9, 1, 9, 2], [10, 1, 10, 2], [1, 1, 10, 1], [1, 2, 10, 2], [1, 3, 2, 3], [1, 4, 2, 4], [1, 5, 2, 5], [6, 5, 7, 5], [1, 6, 2, 6], [6, 6, 7, 6], [1, 7, 2, 7], [6, 7, 7, 7], [6, 8, 7, 8]]
#(15, 9)

# This list has to be traversed once more to form [ [[1,1],[1,7]], [...] ], e.g.:

fm = []
for i in bounds:
  #i=[1,1,1,7] etc.
  f = [[i[0],i[1]],[i[2],i[3]]]
  fm.append(f)

print(fm) 

[[[1, 1], [1, 7]], [[2, 1], [2, 7]], [[3, 1], [3, 2]], [[4, 1], [4, 2]], [[5, 1], [5, 2]], [[6, 1], [6, 2]], [[6, 5], [6, 8]], [[7, 1], [7, 2]], [[7, 5], [7, 8]], [[8, 1], [8, 2]], [[9, 1], [9, 2]], [[10, 1], [10, 2]], [[1, 1], [10, 1]], [[1, 2], [10, 2]], [[1, 3], [2, 3]], [[1, 4], [2, 4]], [[1, 5], [2, 5]], [[6, 5], [7, 5]], [[1, 6], [2, 6]], [[6, 6], [7, 6]], [[1, 7], [2, 7]], [[6, 7], [7, 7]], [[6, 8], [7, 8]]]

Then you can traverse the result list, compute that distance (or do you mean the length of the lines?), and copy only the longer lines to another list.
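
That last traversal could look like the sketch below. The helper name `keep_long` and the default threshold are hypothetical, and "distance" is assumed to be the difference between the start and end coordinates along the line's direction.

```python
def keep_long(lines, min_dist=2):
    """Keep only lines whose end points are more than min_dist apart."""
    kept = []
    for (r0, c0), (r1, c1) in lines:
        # A line is either horizontal or vertical, so one of the two
        # differences is zero and the other is the distance.
        dist = max(abs(r1 - r0), abs(c1 - c0))
        if dist > min_dist:
            kept.append([[r0, c0], [r1, c1]])
    return kept

# A few entries taken from the formatted list above.
sample = [[[1, 1], [1, 7]], [[3, 1], [3, 2]], [[1, 1], [10, 1]], [[6, 5], [6, 8]]]
print(keep_long(sample))
# [[[1, 1], [1, 7]], [[1, 1], [10, 1]], [[6, 5], [6, 8]]]
```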

Answer 4

Is this what you want?

import numpy as np

a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0]])

# Keep only the nonzero values (this flattens the array and discards the positions).
values = a[a != 0].tolist()
print(values)

4 answers

Answer 1: score 4, accepted, 2022-06-20 10:17:38
Answer 2: score 0, 2022-06-20 03:16:34
Answer 3: score 0, 2022-06-20 09:02:32
Answer 4: score -2, 2022-06-20 03:32:28
