How to detect all the boxes used for entering letters in a particular field of a form?

Question

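I need to recognize text from a form that provides a separate box for each character.

I tried using a bounding box for each input and cropping that particular input, i.e. I can get the whole row of boxes used for the "Name" field. But when I try to detect the individual boxes within that group, I can't: OpenCV returns only one contour for all the boxes. The file referenced in the for loop is a file containing the bounding-box coordinates. cropped_img is the image belonging to a single field's input (e.g. Name).

Image of the complete form.

Cropped image of each field.

The cropped image contains many boxes for entering characters. The number of contours detected here is always 1. Why can't I detect all the individual boxes? In short, I want all the individual boxes in cropped_img.

Also, any other ideas for handling form OCR tasks are much appreciated!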

for line in file.read().split("\n"):
        if len(line)==0:
            continue 
        region = list(map(int,line.split(' ')[:-1]))      
        index=line.split(' ')[-1] 
        text=''
        contentDict={}
        #uzn in format left, up, width, height
        region[2] = region[0]+region[2]
        region[3] = region[1]+region[3]
        region = tuple(region)
        cropped_img =  panimg[region[1]:region[3],region[0]:region[2]]

        index=index.replace('_', ' ')
        if index=='sign' or index=='picture' or index=='Dec sign':
            continue

        kernel = np.ones((50,50),np.uint8)
        gray = cv2.cvtColor(cropped_img, cv2.COLOR_BGR2GRAY)
        ret, threshold = cv2.threshold(gray,127,255,cv2.THRESH_BINARY)
        threshold = cv2.bitwise_not(threshold)   
        dilate = cv2.dilate(threshold,kernel,iterations = 1)
        ret, threshold = cv2.threshold(dilate,127,255,cv2.THRESH_BINARY)
        dilate = cv2.dilate(threshold,kernel,iterations = 1)
        contours, hierarchy = cv2.findContours(dilate,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
        contours.sort(key=lambda x:get_contour_precedence(x, panimg.shape[1]))


        print("Length of contours detected: ", len(contours))
        for j, ctr in enumerate(contours):
            # Get bounding box
            x, y, w, h = cv2.boundingRect(ctr)

            # Getting ROI

            roi = cropped_img[y:y+h, x:x+w]
            # show ROI
            cv2.imshow('segment no:'+str(j-1),roi)
            cv2.waitKey(0)

The contents of the file 'file' are as follows:

462 545 468 39 AO_Office
450 785 775 39 Last_Name
452 836 770 37 First_Name
451 885 772 39 Middle_Name
241 963 973 87 Abbreviation_Name

The expected output is the contours of the individual boxes, each used to enter a single letter, for every field.

Answer 1

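I know I'm a bit late to the party :) but in case anyone is looking for a solution to this problem: I recently came up with a Python package that deals with this exact problem.
I called it BoxDetect, and after installing it via: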

pip install boxdetect

you can try something like this:

from boxdetect import config

config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1, 100)
config.horizontal_max_distance_multiplier = 2


from boxdetect.pipelines import get_boxes

image_path = "dumpster/m1nda.jpg"
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)


import matplotlib.pyplot as plt

print("======================")
print("Individual boxes (green): ", rects)
print("======================")
print("Grouped boxes (red): ", grouped_rects)
print("======================")
plt.figure(figsize=(25,25))
plt.imshow(output_image)
plt.show()

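It returns the bounding-rectangle coordinates of all the rectangular boxes, the grouped boxes that form the long input fields, and a visualization drawn on the form image: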

Processing file:  dumpster/m1nda.jpg
======================
Individual boxes (green):  [[1153 1873   26   26]
 [1125 1873   24   27]
 [1098 1873   24   26]
 ...
 [ 558  551   42   28]
 [ 514  551   42   28]
 [ 468  551   42   28]]
======================
Grouped boxes (red):  [(468, 551, 457, 29), (424, 728, 47, 45), (608, 728, 31, 45), (698, 728, 33, 45), (864, 728, 31, 45), (1059, 728, 47, 45), (456, 792, 763, 29), (456, 842, 763, 28), (456, 891, 763, 29), (249, 969, 961, 28), (249, 1017, 962, 28), (700, 1064, 39, 32), (870, 1064, 41, 32), (376, 1124, 45, 45), (626, 1124, 29, 45), (750, 1124, 27, 45), (875, 1124, 41, 45), (1054, 1124, 28, 45), (507, 1188, 706, 29), (507, 1238, 706, 28), (507, 1287, 706, 29), (718, 1335, 36, 31), (856, 1335, 35, 31), (1008, 1335, 34, 32), (260, 1438, 51, 37), (344, 1438, 56, 37), (505, 1443, 98, 27), (371, 1530, 31, 31), (539, 1530, 31, 31), (486, 1636, 694, 28), (486, 1684, 694, 28), (486, 1731, 694, 29), (486, 1825, 694, 29), (486, 1873, 694, 28)]
======================
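
To get from this output back to what the question asks for (the single-character boxes of one field), the detected rectangles can be filtered by a field region from the question's file. The following is only a minimal sketch, not part of BoxDetect: it assumes each row of rects is (x, y, w, h), which is how the printed output reads, and the helper names boxes_in_field and crop_boxes are hypothetical. A blank array stands in for the form image.

```python
import numpy as np

def boxes_in_field(rects, field):
    """Return the boxes whose centre lies inside `field`, sorted left to right.

    Assumption: every box and the field are given as (x, y, width, height).
    """
    left, top, width, height = field
    rects = np.asarray(rects).reshape(-1, 4)
    # Box centres
    cx = rects[:, 0] + rects[:, 2] / 2.0
    cy = rects[:, 1] + rects[:, 3] / 2.0
    inside = (
        (cx >= left) & (cx < left + width)
        & (cy >= top) & (cy < top + height)
    )
    selected = rects[inside]
    # Sort by x so the crops follow the order in which letters are written
    return selected[np.argsort(selected[:, 0])]

def crop_boxes(image, boxes):
    """Cut one sub-image per box out of `image` (rows are y, columns are x)."""
    return [image[y:y + h, x:x + w] for x, y, w, h in boxes]

# A few rectangles copied from the output above
rects = np.array([
    [558, 551, 42, 28],
    [514, 551, 42, 28],
    [468, 551, 42, 28],
    [1153, 1873, 26, 26],
])
# "462 545 468 39 AO_Office" from the question's file
ao_office = (462, 545, 468, 39)

# Blank stand-in for the scanned form; replace with the real image array
page = np.zeros((2000, 1300), dtype=np.uint8)

boxes = boxes_in_field(rects, ao_office)
crops = crop_boxes(page, boxes)
print(len(crops), "boxes found in AO_Office")
```

Each element of crops can then be passed to a character recognizer one at a time. Matching by box centre rather than by full containment is a design choice that tolerates field regions that are slightly too tight.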

Answer 1
0 2020-06-07 21:52:21