How do I detect all the boxes used for entering letters in a specific form field?

Question

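I need to recognize text from a form that provides a separate box for each character.

I tried using a bounding box for each input and cropping that particular input, i.e. I can get the whole group of boxes used for the "Name" field. But when I try to detect the individual boxes within a group, I can't, and OpenCV returns only one contour for all the boxes. The file referenced in the for loop contains the bounding box coordinates. cropped_img is the image belonging to a single field's input (for example, the name).

[Image of the complete form] This is a picture of the form.

[Cropped image of each field]

It contains many boxes for entering characters. The number of contours detected here is always 1. Why can't I detect all the individual boxes? In short, I want all the individual boxes in cropped_img.

Also, any other ideas for approaching the form OCR task are much appreciated!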

import cv2
import numpy as np

# `file` is the opened file of bounding-box coordinates, `panimg` is the full
# form image, and `get_contour_precedence` is a sorting helper defined elsewhere.
for line in file.read().split("\n"):
    if len(line) == 0:
        continue
    region = list(map(int, line.split(' ')[:-1]))
    index = line.split(' ')[-1]
    text = ''
    contentDict = {}
    # uzn format: left, top, width, height; convert to left, top, right, bottom
    region[2] = region[0] + region[2]
    region[3] = region[1] + region[3]
    region = tuple(region)
    cropped_img = panimg[region[1]:region[3], region[0]:region[2]]

    index = index.replace('_', ' ')
    if index == 'sign' or index == 'picture' or index == 'Dec sign':
        continue

    kernel = np.ones((50, 50), np.uint8)
    gray = cv2.cvtColor(cropped_img, cv2.COLOR_BGR2GRAY)
    ret, threshold = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    threshold = cv2.bitwise_not(threshold)
    dilate = cv2.dilate(threshold, kernel, iterations=1)
    ret, threshold = cv2.threshold(dilate, 127, 255, cv2.THRESH_BINARY)
    dilate = cv2.dilate(threshold, kernel, iterations=1)
    contours, hierarchy = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # findContours may return a tuple, so build a sorted list instead of sorting in place
    contours = sorted(contours, key=lambda x: get_contour_precedence(x, panimg.shape[1]))

    print("Length of contours detected: ", len(contours))
    for j, ctr in enumerate(contours):
        # Get the bounding box
        x, y, w, h = cv2.boundingRect(ctr)

        # Get the ROI
        roi = cropped_img[y:y+h, x:x+w]
        # Show the ROI
        cv2.imshow('segment no:'+str(j-1), roi)
        cv2.waitKey(0)

The contents of the file 'file' are as follows:

462 545 468 39 AO_Office
450 785 775 39 Last_Name
452 836 770 37 First_Name
451 885 772 39 Middle_Name
241 963 973 87 Abbreviation_Name

The expected output is the contour of each individual box used to enter a single letter in each field.
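
A likely reason only one contour comes back, offered as a reading of the code above and not a verified diagnosis: the 50x50 dilation kernel is larger than most of the field crops are tall (37 to 39 pixels in the coordinates file), so dilation smears every box border into one blob, and cv2.RETR_EXTERNAL then reports only that blob's outer contour. The sketch below swaps contour detection for a different, NumPy-only technique: it looks for columns that are almost entirely ink (the vertical box borders) and returns the gaps between them. The name split_cells_by_projection and the parameters min_line_frac and min_width are hypothetical, and the sketch assumes an already binarized, deskewed crop in which ink pixels are nonzero.

```python
import numpy as np


def split_cells_by_projection(binary, min_line_frac=0.8, min_width=5):
    """Return (x_start, x_end) column ranges lying between vertical border lines.

    binary: 2-D array that is nonzero wherever there is ink (borders or text).
    min_line_frac and min_width are illustrative defaults, not tuned values.
    """
    ink = np.asarray(binary) > 0
    # Fraction of ink in each column; a vertical border fills almost the full height.
    col_frac = ink.mean(axis=0)
    is_line = col_frac >= min_line_frac

    cells, start, seen_line = [], None, False
    for x, line in enumerate(is_line):
        if line:
            # A border closes the cell that was open, if it is wide enough.
            if start is not None and x - start >= min_width:
                cells.append((start, x))
            start = None
            seen_line = True
        elif seen_line and start is None:
            # First non-border column after a border opens a new cell.
            start = x
    # A run that never meets a closing border (right margin) is ignored.
    return cells


# Synthetic crop: four adjacent boxes, borders at x = 10, 30, 50, 70, 90.
demo = np.zeros((30, 100), dtype=np.uint8)
demo[0, 10:91] = 255
demo[29, 10:91] = 255
for border_x in (10, 30, 50, 70, 90):
    demo[:, border_x] = 255

cells = split_cells_by_projection(demo)
print(len(cells), cells)
```

Each returned range can be used to slice the crop, for example cropped[:, x0:x1]. Handwriting that happens to fill most of a column's height could be mistaken for a border, so the threshold would need checking against real scans.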

Answer 1

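I know I'm a bit late to the party :) but in case anyone is looking for a solution to this problem: I recently put together a Python package that deals with this exact problem.
I called it BoxDetect, and after installing it with: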

pip install boxdetect

you can try something like this:

from boxdetect import config

config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1, 100)
config.horizontal_max_distance_multiplier = 2


from boxdetect.pipelines import get_boxes

image_path = "dumpster/m1nda.jpg"
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)


import matplotlib.pyplot as plt

print("======================")
print("Individual boxes (green): ", rects)
print("======================")
print("Grouped boxes (red): ", grouped_rects)
print("======================")
plt.figure(figsize=(25,25))
plt.imshow(output_image)
plt.show()

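It returns the bounding rectangle coordinates of all the rectangular boxes, the grouped boxes that form long input fields, and a visualization on the form image: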

Processing file:  dumpster/m1nda.jpg
======================
Individual boxes (green):  [[1153 1873   26   26]
 [1125 1873   24   27]
 [1098 1873   24   26]
 ...
 [ 558  551   42   28]
 [ 514  551   42   28]
 [ 468  551   42   28]]
======================
Grouped boxes (red):  [(468, 551, 457, 29), (424, 728, 47, 45), (608, 728, 31, 45), (698, 728, 33, 45), (864, 728, 31, 45), (1059, 728, 47, 45), (456, 792, 763, 29), (456, 842, 763, 28), (456, 891, 763, 29), (249, 969, 961, 28), (249, 1017, 962, 28), (700, 1064, 39, 32), (870, 1064, 41, 32), (376, 1124, 45, 45), (626, 1124, 29, 45), (750, 1124, 27, 45), (875, 1124, 41, 45), (1054, 1124, 28, 45), (507, 1188, 706, 29), (507, 1238, 706, 28), (507, 1287, 706, 29), (718, 1335, 36, 31), (856, 1335, 35, 31), (1008, 1335, 34, 32), (260, 1438, 51, 37), (344, 1438, 56, 37), (505, 1443, 98, 27), (371, 1530, 31, 31), (539, 1530, 31, 31), (486, 1636, 694, 28), (486, 1684, 694, 28), (486, 1731, 694, 29), (486, 1825, 694, 29), (486, 1873, 694, 28)]
======================
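
To feed each detected box to an OCR step, the individual rectangles can be cut out of the image in reading order. The following is a minimal sketch, assuming each row of rects is (x, y, width, height) as the printed output above suggests. The helper crop_boxes and its row_tolerance parameter are hypothetical and not part of BoxDetect.

```python
import numpy as np


def crop_boxes(image, rects, row_tolerance=10):
    """Return [(rect, crop), ...] ordered top to bottom, then left to right.

    rects: iterable of (x, y, w, h); row_tolerance is the vertical jitter in
    pixels still treated as the same row (an illustrative default).
    """
    rects = sorted((tuple(int(v) for v in r) for r in rects),
                   key=lambda r: (r[1], r[0]))
    rows = []
    for r in rects:
        # Same row if the top edge is close to that of the row's first box.
        if rows and abs(r[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(r)
        else:
            rows.append([r])
    ordered = [r for row in rows for r in sorted(row, key=lambda r: r[0])]
    # NumPy images are indexed [row, column], i.e. [y, x].
    return [(r, image[r[1]:r[1] + r[3], r[0]:r[0] + r[2]]) for r in ordered]


# A few rectangles copied from the output above, on a blank stand-in image.
sample_rects = np.array([[1153, 1873, 26, 26],
                         [1125, 1873, 24, 27],
                         [1098, 1873, 24, 26],
                         [558, 551, 42, 28],
                         [514, 551, 42, 28],
                         [468, 551, 42, 28]])
blank_image = np.zeros((2000, 1300), dtype=np.uint8)

crops = crop_boxes(blank_image, sample_rects)
for rect, crop in crops:
    print(rect, crop.shape)
```

With the real pipeline, org_image and rects from get_boxes would take the place of blank_image and sample_rects.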

Solution 1
0 2020-06-07 21:52:21