OCR - How to recognize numbers inside square boxes using Python?

One problem with optical character recognition (OCR) is that it often fails to recognize numbers when they are inside square boxes. One failure example with Tesseract is discussed here: Tesseract - How can I recognize numbers in box? I was testing with PaddleOCR here: https://www.paddlepaddle.org.cn/hub/scene/ocr (you can quickly try that API too). For this input image:

[image] It returns nothing.

But when I try an image like this: [image]

it returns all the numbers successfully. Most of the time, number recognition (both printed and handwritten) fails when the numbers are inside square boxes. To recognize such numbers, we need to convert the numbers-in-box image into a plain numbers image by removing all the square boxes. I have some images like the ones below:

[image] [image]

[image] [image]

As you can see, the square boxes around the numbers are not fully visible; only parts of them are. I want to convert these images into images that contain only the numbers, by removing the square boxes (or the parts of them that are present). After that, hopefully number/digit recognition will work. I tried this code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

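img = cv2.imread('/content/21.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# 11x11 kernel with only the centre column set, i.e. a vertical line
linek = np.zeros((11, 11), dtype=np.uint8)
linek[..., 5] = 1
# Open with the vertical-line kernel, then subtract the result from the grayscale image
x = cv2.morphologyEx(gray, cv2.MORPH_OPEN, linek, iterations=800)
gray -= x
plt.imshow(gray)
cv2.imwrite('21_output.jpg', gray)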

Output:

[image]

I also tried this code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

#https://stackoverflow.com/questions/57961119/how-to-remove-all-the-detected-lines-from-the-original-image-using-python

image = cv2.imread('/content/17.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Remove vertical
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,10))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
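for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 2)

# Note: this reassignment discards the contours drawn above;
# what gets displayed is the thresholded image minus the detected lines
image = thresh - detected_lines
plt.imshow(image)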

Output:

[image]

Unfortunately, it is not able to remove the unwanted lines completely. When it removes unwanted lines, it removes parts of the original digits as well. How can I remove those complete or incomplete square boxes around each number in the image? Thanks in advance.

The code below does a decent job for me, but it is sensitive to hyperparameters:

import cv2
import imutils
import numpy as np
import matplotlib.pyplot as plt


def square_number_box_denoiser(image_path="/content/9.png",is_resize = False, resize_width = 768):
    '''
    Remove box line segments around digits and save the cleaned image as result.jpg.

    ref : https://pretagteam.com/question/removing-horizontal-lines-in-image-opencv-python-matplotlib

    Args:
      image_path (str): path of the image containing numbers/digits inside square boxes
      is_resize (bool): whether to resize the input image. default: False
      resize_width (int): target width when resizing; the aspect ratio is preserved. default: 768
    '''
    img=cv2.imread(image_path)
    if(is_resize):
      print("resizing...")
      img = imutils.resize(img, width=resize_width)
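    # Rotate by 90 degrees so the horizontal kernel below targets lines
    # that are vertical in the original image
    image = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)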
    gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Remove horizontal
    horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
    detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
    cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        cv2.drawContours(image, [c], -1, (255,255,255), 2)

    # Repair image
    repair_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,6))
    result = 255 - cv2.morphologyEx(255 - image, cv2.MORPH_CLOSE, repair_kernel, iterations=2)

    # create figure
    fig = plt.figure(figsize=(20, 20))
    # setting values to rows and column variables
    rows = 3
    columns = 3

    fig.add_subplot(rows,  columns, 1)
    plt.imshow(img)
    fig.add_subplot(rows,  columns, 2)
    plt.imshow(thresh)
    fig.add_subplot(rows,  columns, 3)
    plt.imshow(detected_lines)
    fig.add_subplot(rows,  columns, 4)
    plt.imshow(image)
    fig.add_subplot(rows,  columns, 5)
    plt.imshow(result)
    result = cv2.rotate(result,cv2.ROTATE_90_COUNTERCLOCKWISE)
    fig.add_subplot(rows,  columns, 6)
    plt.imshow(result)
    cv2.imwrite("result.jpg", result)

    plt.show()

Outputs: [image]

Without resizing:

[image]

[image]

[image]

[image]

With resizing to width 768:

[image]

[image]

[image]

[image]
