
How to detect all boxes for inputting letters in forms for a particular field?

I need to recognize text from forms in which each character is written in its own box.

I have tried using a bounding box for each field and cropping that field, i.e. I can get all the boxes of the 'Name' field as one crop. But when I try to detect the individual boxes within that group, OpenCV returns only one contour for all the boxes. The file read in the for loop contains the coordinates of the bounding boxes. cropped_img is the image of a single field's input (e.g. Name).

Full form image: this is the image of the form.

Cropped image: the cropped image for each field.

It contains many boxes for inputting characters, yet the number of contours detected is always one. Why am I not able to detect the individual boxes? In short, I want all the individual boxes in cropped_img.

Any other ideas for approaching form OCR would also be appreciated!

for line in file.read().split("\n"):
        if len(line)==0:
            continue 
        region = list(map(int,line.split(' ')[:-1]))      
        index=line.split(' ')[-1] 
        text=''
        contentDict={}
        #uzn in format left, up, width, height
        region[2] = region[0]+region[2]
        region[3] = region[1]+region[3]
        region = tuple(region)
        cropped_img =  panimg[region[1]:region[3],region[0]:region[2]]

        index=index.replace('_', ' ')
        if index=='sign' or index=='picture' or index=='Dec sign':
            continue

        kernel = np.ones((50,50),np.uint8)
        gray = cv2.cvtColor(cropped_img, cv2.COLOR_BGR2GRAY)
        ret, threshold = cv2.threshold(gray,127,255,cv2.THRESH_BINARY)
        threshold = cv2.bitwise_not(threshold)   
        dilate = cv2.dilate(threshold,kernel,iterations = 1)
        ret, threshold = cv2.threshold(dilate,127,255,cv2.THRESH_BINARY)
        dilate = cv2.dilate(threshold,kernel,iterations = 1)
        contours, hierarchy = cv2.findContours(dilate,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
        # sorted() works whether findContours returns a list or a tuple (newer OpenCV)
        contours = sorted(contours, key=lambda c: get_contour_precedence(c, panimg.shape[1]))


        print("Length of contours detected: ", len(contours))
        for j, ctr in enumerate(contours):
            # Get bounding box
            x, y, w, h = cv2.boundingRect(ctr)

            # Getting ROI

            roi = cropped_img[y:y+h, x:x+w]
            # show ROI
            cv2.imshow('segment no:'+str(j-1),roi)
            cv2.waitKey(0)

The content of file 'file' is as follows:

462 545 468 39 AO_Office
450 785 775 39 Last_Name
452 836 770 37 First_Name
451 885 772 39 Middle_Name
241 963 973 87 Abbreviation_Name

The expected output is one contour per individual character box in each field.

I know I'm a bit late to the party :) but in case somebody is looking for a solution to this problem: I recently wrote a Python package that deals with exactly this.
I called it BoxDetect, and you can install it with:

pip install boxdetect

You can try something like this:

from boxdetect import config

config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1, 100)
config.horizontal_max_distance_multiplier = 2


from boxdetect.pipelines import get_boxes

image_path = "dumpster/m1nda.jpg"
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)


import matplotlib.pyplot as plt

print("======================")
print("Individual boxes (green): ", rects)
print("======================")
print("Grouped boxes (red): ", grouped_rects)
print("======================")
plt.figure(figsize=(25,25))
plt.imshow(output_image)
plt.show()

It returns the bounding-rectangle coordinates of all individual boxes, the grouped boxes that form long entry fields, and a visualization drawn on the form image:

Processing file:  dumpster/m1nda.jpg
======================
Individual boxes (green):  [[1153 1873   26   26]
 [1125 1873   24   27]
 [1098 1873   24   26]
 ...
 [ 558  551   42   28]
 [ 514  551   42   28]
 [ 468  551   42   28]]
======================
Grouped boxes (red):  [(468, 551, 457, 29), (424, 728, 47, 45), (608, 728, 31, 45), (698, 728, 33, 45), (864, 728, 31, 45), (1059, 728, 47, 45), (456, 792, 763, 29), (456, 842, 763, 28), (456, 891, 763, 29), (249, 969, 961, 28), (249, 1017, 962, 28), (700, 1064, 39, 32), (870, 1064, 41, 32), (376, 1124, 45, 45), (626, 1124, 29, 45), (750, 1124, 27, 45), (875, 1124, 41, 45), (1054, 1124, 28, 45), (507, 1188, 706, 29), (507, 1238, 706, 28), (507, 1287, 706, 29), (718, 1335, 36, 31), (856, 1335, 35, 31), (1008, 1335, 34, 32), (260, 1438, 51, 37), (344, 1438, 56, 37), (505, 1443, 98, 27), (371, 1530, 31, 31), (539, 1530, 31, 31), (486, 1636, 694, 28), (486, 1684, 694, 28), (486, 1731, 694, 29), (486, 1825, 694, 29), (486, 1873, 694, 28)]
======================

(image: the form with the detected boxes drawn on it)
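To get from these rectangles to per-character OCR, one option is to assign each individual box to the grouped box that contains it and read the members left to right. This is a minimal sketch that assumes both outputs are in (x, y, w, h) order, as the printed values above suggest. The helper name boxes_by_group and the tolerance parameter are hypothetical and not part of boxdetect.

```python
def boxes_by_group(rects, grouped_rects, tolerance=5):
    """Map each grouped rectangle to its member boxes, ordered left to right.

    A box belongs to a group when its centre lies inside the group rectangle,
    widened by `tolerance` pixels on every side.
    """
    result = {}
    for gx, gy, gw, gh in grouped_rects:
        members = []
        for x, y, w, h in rects:
            cx, cy = x + w / 2, y + h / 2
            inside_x = gx - tolerance <= cx <= gx + gw + tolerance
            inside_y = gy - tolerance <= cy <= gy + gh + tolerance
            if inside_x and inside_y:
                members.append((int(x), int(y), int(w), int(h)))
        # Sorting tuples orders the boxes by x first, i.e. left to right.
        result[(int(gx), int(gy), int(gw), int(gh))] = sorted(members)
    return result

# A few values copied from the output above.
rects = [[558, 551, 42, 28], [514, 551, 42, 28], [468, 551, 42, 28]]
grouped_rects = [(468, 551, 457, 29)]

fields = boxes_by_group(rects, grouped_rects)
for group, members in fields.items():
    print(group, "->", members)
```

Assuming org_image is a NumPy image array, each character could then be cropped with org_image[y:y+h, x:x+w] and passed to an OCR step.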
