How can I recognize numbers from a picture using python and an OCR engine?
OCR - How to recognize numbers inside square boxes using python?
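One problem with optical character recognition (OCR) is that it cannot recognize digits correctly when they sit inside square boxes. A failing example with tesseract is discussed here: Tesseract - How can I identify numbers in box? I am testing with paddleocr here: https://www.paddlepaddle.org.cn/hub/scene/ocr You can also try this API quickly. For this input image:
it returns all the digits successfully. Most of the time, though, digit recognition (printed and handwritten) fails when the digits are inside square boxes. To recognize digits inside square boxes, we need to turn these digits-in-boxes images into plain digit images by removing all the square boxes. I have some images like the ones below:
Notice that the full square box around each digit is not completely visible; only parts of the boxes are visible. I want to convert these images by removing the square boxes, or the visible parts of them, so that only the digits remain, after which digit recognition will hopefully work. I tried this code: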
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('/content/21.png')
gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
linek = np.zeros((11,11),dtype=np.uint8)
linek[...,5]=1
x=cv2.morphologyEx(gray, cv2.MORPH_OPEN, linek ,iterations=800)
gray-=x
plt.imshow(gray)
cv2.imwrite('21_output.jpg', gray)
Output:
I also tried this code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
#https://stackoverflow.com/questions/57961119/how-to-remove-all-the-detected-lines-from-the-original-image-using-python
image = cv2.imread('/content/17.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove vertical
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,10))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 2)
image = thresh - detected_lines
plt.imshow( image)
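Output:
Unfortunately, it cannot remove the unwanted lines completely, and when it does remove them it also removes parts of the original digits. How can I remove the complete or incomplete square boxes around each digit in the image? Thanks in advance.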
The code below does a decent job for me, but it is sensitive to hyperparameters:
import cv2
import imutils
import numpy as np
from matplotlib import pyplot as plt

def square_number_box_denoiser(image_path="/content/9.png", is_resize=False, resize_width=768):
    '''
    ref : https://pretagteam.com/question/removing-horizontal-lines-in-image-opencv-python-matplotlib
    Args :
        image_path (str) : path of the image containing numbers/digits inside square box
        is_resize (bool) : whether to resize the input image or not? default : False
        resize_width (int) : target width when resizing; the aspect ratio is maintained. default : 768
    '''
    img = cv2.imread(image_path)
    if is_resize:
        print("resizing...")
        img = imutils.resize(img, width=resize_width)
    # Rotate so that the vertical box lines become horizontal
    image = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Remove horizontal
    horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 1))
    detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
    cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # findContours returns 2 values in OpenCV 4 and 3 values in OpenCV 3
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        # Paint the detected lines white
        cv2.drawContours(image, [c], -1, (255, 255, 255), 2)

    # Repair image
    repair_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 6))
    result = 255 - cv2.morphologyEx(255 - image, cv2.MORPH_CLOSE, repair_kernel, iterations=2)

    # create figure
    fig = plt.figure(figsize=(20, 20))
    # setting values to rows and column variables
    rows = 3
    columns = 3
    fig.add_subplot(rows, columns, 1)
    plt.imshow(img)
    fig.add_subplot(rows, columns, 2)
    plt.imshow(thresh)
    fig.add_subplot(rows, columns, 3)
    plt.imshow(detected_lines)
    fig.add_subplot(rows, columns, 4)
    plt.imshow(image)
    fig.add_subplot(rows, columns, 5)
    plt.imshow(result)
    # Rotate back to the original orientation
    result = cv2.rotate(result, cv2.ROTATE_90_COUNTERCLOCKWISE)
    fig.add_subplot(rows, columns, 6)
    plt.imshow(result)
    cv2.imwrite("result.jpg", result)
    plt.show()
Without resizing:
Resized to a width of 768: