How to find lines in a 2D numpy array?

I have a 2D numpy array and I want to find the boundary points of both the horizontal and the vertical lines.

import numpy as np

gray_img = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0]])

desired_outcome = [ [[1,1],[10,1]],
                    [[1,2],[10,2]],
                    [[1,3],[2,3]], ...]

Here are the lines I want to find: [the two images from the original post are not available]

Later, I want to remove the shorter lines and keep only those whose end points are more than 2 points apart.

vertical lines:

m = np.diff(gray_img, 1, 0)  # discrete difference along axis 0 (between consecutive rows)
m = np.argwhere(m != 0)      # (row, col) indices where the value changes
m = m[np.lexsort(m.T)]       # sort by column first, then by row
m[::2, 0] += 1               # a change at diff row i means the line starts at row i + 1

output:

[[ 1  1]
 [10  1]
 [ 1  2]
 [10  2]
 [ 1  3]
 [ 2  3]
 [ 1  4]
 [ 2  4]
 [ 1  5]
 [ 2  5]
 [ 6  5]
 [ 7  5]
 [ 1  6]
 [ 2  6]
 [ 6  6]
 [ 7  6]
 [ 1  7]
 [ 2  7]
 [ 6  7]
 [ 7  7]
 [ 6  8]
 [ 7  8]]
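The sorted index list above alternates start, end, start, end, so it still has to be paired up to match desired_outcome. The `m[::2]` trick also relies on every column starting with a 0 at the top and ending with a 0 at the bottom. Below is a rough sketch of a variation (not the answer's exact code) that pads the image with zero rows first, pairs the points, and then drops short lines. The function name `vertical_segments` and the "end minus start greater than 2" reading of the length requirement are my assumptions.

```python
import numpy as np

# Rebuild the question's 15x9 test image compactly.
gray_img = np.zeros((15, 9), dtype=int)
gray_img[1:3, 1:8] = 255
gray_img[3:11, 1:3] = 255
gray_img[6:8, 5:9] = 255

def vertical_segments(img):
    """Return an (n, 2, 2) array of [[row0, col], [row1, col]] pairs."""
    # Zero rows above and below close any line that touches the border.
    padded = np.pad((img != 0).astype(np.int8), ((1, 1), (0, 0)))
    # Transposed so that argwhere orders the hits by column, then by row.
    d = np.diff(padded, axis=0).T
    starts = np.argwhere(d == 1)    # (col, first row of the line)
    ends = np.argwhere(d == -1)     # (col, last row of the line + 1)
    ends[:, 1] -= 1
    # Swap (col, row) back to (row, col) and pair each start with its end.
    return np.stack([starts[:, ::-1], ends[:, ::-1]], axis=1)

segs = vertical_segments(gray_img)
lengths = segs[:, 1, 0] - segs[:, 0, 0]   # distance between the end points
long_segs = segs[lengths > 2]
print(long_segs.tolist())  # [[[1, 1], [10, 1]], [[1, 2], [10, 2]]]
```

The first two entries of `segs` match the first two entries of desired_outcome in the question.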

horizontal lines:

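# append a zero column so lines that reach the right edge are closed
m = np.diff(gray_img, 1, 1, append=np.zeros((gray_img.shape[0], 1)))
m = np.argwhere(m != 0)  # already ordered by row, then by column
m[::2, 1] += 1           # a change at diff column j means the line starts at column j + 1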

output:

[[ 1  1]
 [ 1  7]
 [ 2  1]
 [ 2  7]
 [ 3  1]
 [ 3  2]
 [ 4  1]
 [ 4  2]
 [ 5  1]
 [ 5  2]
 [ 6  1]
 [ 6  2]
 [ 6  5]
 [ 6  8]
 [ 7  1]
 [ 7  2]
 [ 7  5]
 [ 7  8]
 [ 8  1]
 [ 8  2]
 [ 9  1]
 [ 9  2]
 [10  1]
 [10  2]]
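Since the rows of this output alternate between start and end points, a reshape is enough to get the nested [[start], [end]] layout from the question, and the column difference gives a length to filter on. A minimal sketch, assuming no line touches the left border (the snippet only appends a zero column on the right) and assuming "more than 2 points distance" means end minus start greater than 2:

```python
import numpy as np

# Rebuild the question's 15x9 test image compactly.
gray_img = np.zeros((15, 9), dtype=int)
gray_img[1:3, 1:8] = 255
gray_img[3:11, 1:3] = 255
gray_img[6:8, 5:9] = 255

# Horizontal boundary points, exactly as in the snippet above.
m = np.diff(gray_img, 1, 1, append=np.zeros((gray_img.shape[0], 1)))
m = np.argwhere(m != 0)
m[::2, 1] += 1

# Consecutive rows of m are (start, end), so group them two by two.
pairs = m.reshape(-1, 2, 2)
lengths = pairs[:, 1, 1] - pairs[:, 0, 1]   # end column minus start column
long_pairs = pairs[lengths > 2]
print(long_pairs.tolist())
# [[[1, 1], [1, 7]], [[2, 1], [2, 7]], [[6, 5], [6, 8]], [[7, 5], [7, 8]]]
```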

Here's an algorithm for horizontal lines based on prefix sums:

# Do a prefix sum to get the lengths of each line.
gray_cumsum = np.cumsum(gray_img / 255, axis=1)
gray_cumsum[:, 1:] = gray_cumsum[:, 1:] * (gray_cumsum[:, 1:] != gray_cumsum[:, :-1])
# Reindex all the points so each line starts at 1.
start_num = gray_cumsum.copy()
a = start_num[:,1:-1] != 0
b = start_num[:,:-2] == 0
c = start_num[:,2:] != 0
start_num[:,1:-1] = start_num[:,1:-1] * np.logical_and(np.logical_and(a, b), c)
start_num[:, -1] = 0
start_num = np.maximum.accumulate(start_num, axis=1)
gray_cumsum = np.maximum(gray_cumsum - start_num, 0)
# Detect only the ends of each line.
gray_cumsum[:,:-1] = gray_cumsum[:,:-1] * (gray_cumsum[:,1:] == 0)
# Get the starting and ending points of each line.
end_points = np.stack(gray_cumsum.nonzero(), axis=-1)
lengths = gray_cumsum[gray_cumsum.nonzero()]
start_points = end_points.copy()
start_points[:, 1] = start_points[:, 1] - lengths
print(start_points)
print(end_points)

Just change the indexing to get vertical lines. You can use the lengths array to filter out which lines you want.
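To make those two remarks concrete, here is a sketch that wraps the same prefix-sum steps in a function, runs it on the transposed image to get vertical lines, and filters with the lengths array. The function name `horizontal_runs` and the threshold of 2 are my own choices. Note that a length here is end minus start, so a run covering n pixels has length n - 1.

```python
import numpy as np

# Rebuild the question's 15x9 test image compactly.
gray_img = np.zeros((15, 9), dtype=int)
gray_img[1:3, 1:8] = 255
gray_img[3:11, 1:3] = 255
gray_img[6:8, 5:9] = 255

def horizontal_runs(img):
    """Prefix-sum run detection along axis 1, following the steps above."""
    cs = np.cumsum(img != 0, axis=1)
    cs[:, 1:] = cs[:, 1:] * (cs[:, 1:] != cs[:, :-1])
    # Keep the prefix-sum value only where a run begins, then spread it right.
    start = cs.copy()
    is_start = (start[:, 1:-1] != 0) & (start[:, :-2] == 0) & (start[:, 2:] != 0)
    start[:, 1:-1] = start[:, 1:-1] * is_start
    start[:, -1] = 0
    start = np.maximum.accumulate(start, axis=1)
    cs = np.maximum(cs - start, 0)
    # Keep only the value at the last pixel of each run.
    cs[:, :-1] = cs[:, :-1] * (cs[:, 1:] == 0)
    ends = np.argwhere(cs)
    lengths = cs[cs != 0]
    starts = ends.copy()
    starts[:, 1] -= lengths
    return starts, ends, lengths

# Horizontal lines, keeping only those longer than 2.
h_starts, h_ends, h_len = horizontal_runs(gray_img)
keep = h_len > 2
print(h_starts[keep].tolist())  # [[1, 1], [2, 1], [6, 5], [7, 5]]
print(h_ends[keep].tolist())    # [[1, 7], [2, 7], [6, 8], [7, 8]]

# Vertical lines: run the same code on the transpose, then swap the
# (col, row) results back to (row, col).
v_starts, v_ends, v_len = horizontal_runs(gray_img.T)
v_starts, v_ends = v_starts[:, ::-1], v_ends[:, ::-1]
print(v_starts[v_len > 2].tolist())  # [[1, 1], [1, 2]]
print(v_ends[v_len > 2].tolist())    # [[10, 1], [10, 2]]
```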

EDIT: The code now collects both horizontal and vertical lines. I also removed some superfluous comparisons from the first draft: scanning goes in one direction only, so just the right/bottom end coordinate has to be updated, while the left/top coordinate stays constant, equal to the position where the scan of the current line segment began. There is still some superfluous code that could be compressed.

EDIT2: Added the final formatting of the list as in the question.

It finds and lists the horizontal lines first. If the vertical lines have to come first, as in your example output, just put the second traversal on top.

    import numpy as np
                
    gray_img = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 255, 255, 255, 255, 255, 0],
                         [0, 255, 255, 255, 255, 255, 255, 255, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 255, 255, 255, 255],
                         [0, 255, 255, 0, 0, 255, 255, 255, 255],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 255, 255, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 0, 0, 0, 0, 0]])
                
    bounds = []
    a = gray_img

    #SCANNING HORIZONTAL LINES
    for y in range(0,a.shape[0]):
      print("\nY=",y)  
      found = False #set to True when scanning a line
      left_x, right_x = 0,0
      top_y = y; bottom_y = y
      if a[y,0]==255:
        top_y = min(top_y, y)
        bottom_y = max(bottom_y,y)     
        found = True
      for x in range(0,a.shape[1]): #
        #while x < len(a[1]) or x!==0 #...
         if a[y,x]==255 and not found: #first item
           found = True
           right_x = x
           left_x = x
           #right_x = max(right_x, x)     
           #left_x = x # min(left_x, x)
           print("START",top_y, bottom_y, left_x, right_x)
         else:
           if a[y,x]==255 and found: #running line       
             right_x = max(right_x, x)     
             #left_x = min(left_x, x)
             #print(top_y, bottom_y, left_x, right_x)
         if a[y,x]==0 and found: #end of a running line
            bounds.append([top_y, left_x, bottom_y, right_x])
            print(a[y,x],end=",")
            found = False
      if found: #end of a running line matches the end of the dimension/line
         bounds.append([top_y, left_x, bottom_y, right_x])
         print(a[y,x],end=",")
         found = False                                       
    print("\n")
    print(bounds)
    print(a.shape)
    #print(f"LEN= {a.shape[0])}, {a.shape[1])}")  
    
    #SCANNING VERTICAL LINES
    for x in range(0,a.shape[1]):
      print("\nX=",x)
      found = False #set to True when scanning a line
      left_x, right_x = x,x
      top_y = 0; bottom_y = 0
      if a[0,x]==255: #column starts inside a line; top_y and bottom_y are already 0
        found = True
      for y in range(0,a.shape[0]): #
        #while x < len(a[1]) or x!==0 #...
         if a[y,x]==255 and not found: #first item
           found = True
           #right_x = max(right_x, x)     
           bottom_y = y       
           top_y = y
           left_x = x # min(left_x, x)
           print("START",top_y, bottom_y, left_x, right_x)
         else:
           if a[y,x]==255 and found: #running line       
             bottom_y = y #max(right_x, x)     
             #top_y = min(left_x, x)
             #print(top_y, bottom_y, left_x, right_x)
         if a[y,x]==0 and found: #end of a running line
            bounds.append([top_y, left_x, bottom_y, right_x])
            print(a[y,x],end=",")
            found = False
      if found: #end of a running line matches the end of the dimension/line
         bounds.append([top_y, left_x, bottom_y, right_x])
         print(a[y,x],end=",")
         found = False                                       
    print("\n")
    print(bounds)
    print(a.shape)   

# [[1, 1, 1, 7], [2, 1, 2, 7], [3, 1, 3, 2], [4, 1, 4, 2], [5, 1, 5, 2], [6, 1, 6, 2], [6, 5, 6, 8], [7, 1, 7, 2], [7, 5, 7, 8], [8, 1, 8, 2], [9, 1, 9, 2], [10, 1, 10, 2], [1, 1, 10, 1], [1, 2, 10, 2], [1, 3, 2, 3], [1, 4, 2, 4], [1, 5, 2, 5], [6, 5, 7, 5], [1, 6, 2, 6], [6, 6, 7, 6], [1, 7, 2, 7], [6, 7, 7, 7], [6, 8, 7, 8]]
#(15, 9)

# This list has to be traversed once more to form [[[1, 1], [1, 7]], [...]], e.g.:

fm = []
for i in bounds:
  #i=[1,1,1,7] etc.
  f = [[i[0],i[1]],[i[2],i[3]]]
  fm.append(f)

print(fm) 

[[[1, 1], [1, 7]], [[2, 1], [2, 7]], [[3, 1], [3, 2]], [[4, 1], [4, 2]], [[5, 1], [5, 2]], [[6, 1], [6, 2]], [[6, 5], [6, 8]], [[7, 1], [7, 2]], [[7, 5], [7, 8]], [[8, 1], [8, 2]], [[9, 1], [9, 2]], [[10, 1], [10, 2]], [[1, 1], [10, 1]], [[1, 2], [10, 2]], [[1, 3], [2, 3]], [[1, 4], [2, 4]], [[1, 5], [2, 5]], [[6, 5], [7, 5]], [[1, 6], [2, 6]], [[6, 6], [7, 6]], [[1, 7], [2, 7]], [[6, 7], [7, 7]], [[6, 8], [7, 8]]]
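As the EDIT note says, the scanning code can be compressed. One possible condensed form of the same idea (remember where a run starts, emit it when a zero or the edge ends it) is sketched below. The helper names `scan_runs` and `find_lines` are mine, and unlike the vectorized answers this sketch also reports isolated single pixels as zero-length lines.

```python
import numpy as np

# Rebuild the question's 15x9 test image compactly.
gray_img = np.zeros((15, 9), dtype=int)
gray_img[1:3, 1:8] = 255
gray_img[3:11, 1:3] = 255
gray_img[6:8, 5:9] = 255

def scan_runs(values):
    """Yield (first, last) index pairs of consecutive nonzero entries."""
    start = None
    for i, v in enumerate(values):
        if v and start is None:
            start = i                 # a run begins; its start stays fixed
        elif not v and start is not None:
            yield start, i - 1        # a zero closes the running line
            start = None
    if start is not None:             # the run reaches the end of the row/column
        yield start, len(values) - 1

def find_lines(img):
    """Horizontal lines first, then vertical, as [[y0, x0], [y1, x1]]."""
    lines = []
    for y, row in enumerate(img):
        lines += [[[y, x0], [y, x1]] for x0, x1 in scan_runs(row)]
    for x, col in enumerate(img.T):
        lines += [[[y0, x], [y1, x]] for y0, y1 in scan_runs(col)]
    return lines

lines = find_lines(gray_img)
print(len(lines))   # 23
print(lines[0])     # [[1, 1], [1, 7]]
print(lines[12])    # [[1, 1], [10, 1]]
```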

Then you can traverse the result list, compute that distance (or do you mean the length of the lines?), and copy only the longer lines to another list.
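That last filtering step could look roughly like this. It is only a sketch: the function name `keep_long_lines` is made up, and I am assuming "distance" means the difference between the end point and the start point along the line's direction.

```python
def keep_long_lines(lines, min_distance=2):
    """Keep lines whose end points are more than min_distance apart."""
    kept = []
    for (y0, x0), (y1, x1) in lines:
        # Lines are axis-aligned, so one of the two differences is always 0.
        distance = max(abs(y1 - y0), abs(x1 - x0))
        if distance > min_distance:
            kept.append([[y0, x0], [y1, x1]])
    return kept

# A few entries taken from the result list above.
fm = [[[1, 1], [1, 7]], [[3, 1], [3, 2]], [[1, 1], [10, 1]], [[6, 5], [6, 8]]]
print(keep_long_lines(fm))
# [[[1, 1], [1, 7]], [[1, 1], [10, 1]], [[6, 5], [6, 8]]]
```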

Is this what you want?

import numpy as np

a = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 255, 255, 255, 255, 255, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 255, 255, 255, 255],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 255, 255, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 0, 0]])

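a = a[a != 0]        # keep only the nonzero values (this flattens the array to 1D)
values = a.tolist()  # named `values` to avoid shadowing the built-in `list`
print(values)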
