
Recognizing digits with OpenCV and Python (Simple digit OCR)

I'm trying to create a program that can tell which number an image shows and print the integer to the console. (I'm using Python 3.)

For example, the program should recognize that the following image (an actual image the program has to check) is the number 2:

[image: number 2]

I've tried simply comparing it with another image containing a 2 using cv2.matchTemplate(), but the RGB values of the blue pixels differ slightly from image to image, and the image can be a bit larger or smaller. For example, the following image:

[image: number 2]

It also has to be told apart from all the other blue number images (0-9), for example the following one:

[image: number 5]

I've tried multiple template matching snippets and made a folder with images of the numbers 0-9 as templates, but each time almost every number is recognized in the image that needs to be recognized. For example, number 5 gets recognized in an image that is number 2. And if it doesn't recognize all of them, it recognizes the wrong one(s).

The ones I've tried:

But like I said before, they come with those problems.

I've also tried measuring the percentage of blue in each image, but the values were too close to tell the numbers apart.

Does anyone have a solution? Am I being stupid for using cv2.matchTemplate(), and is there a much simpler option? (I don't mind using a library for it, because this is part of a bigger piece of code, but I'd prefer to code it myself rather than rely on libraries.)

Instead of using template matching, a better approach is to use Pytesseract OCR to read the number with image_to_string(). But before performing OCR, you need to preprocess the image. For optimal OCR performance, the preprocessed image should have the desired text/number/characters to OCR in black with the background in white. A simple preprocessing step is to convert the image to grayscale, apply Otsu's threshold to obtain a binary image, then invert the image. Here's a visualization of the preprocessing step:

Input image -> Grayscale -> Otsu's threshold -> Inverted image ready for OCR

[images: input, grayscale, Otsu's threshold, inverted]

Result from Pytesseract OCR

2

Here are the results with the other images:

[images: input, grayscale, Otsu's threshold, inverted]

2

[images: input, grayscale, Otsu's threshold, inverted]

5

We use the --psm 6 configuration option to assume a single uniform block of text. See here for more configuration options.

Code

import cv2
import pytesseract

# Path to the Tesseract executable (Windows example; adjust or remove on other systems)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load image, grayscale, Otsu's threshold, then invert
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
invert = 255 - thresh

# Perform OCR with Pytesseract
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()

Note: If you insist on using template matching, you need to use scale variant template matching. Take a look at "how to isolate everything inside of a contour, scale it, and test the similarity to an image?" and "Python OpenCV line detection to detect X symbol in image" for some examples. If you know for certain that your images are blue, then another approach would be to use color thresholding with cv2.inRange() to obtain a binary mask image, then apply OCR on the image.

Given the lovely regular input, I expect that all you need is a simple comparison to templates. Since you neglected to supply your code and output, it's hard to tell what might have gone wrong.

Very simply ...

  • Rescale your input to the size of your templates.
  • Calculate any straightforward matching evaluation on the input with each of the 10 templates. A simple matching count should suffice: how many pixels match between the two images.
  • The template with the highest score is the identification.

You might also want to set a lower threshold for declaring a match, perhaps based on how well that template matches each of the other templates: any identification has to clearly exceed the match between two different templates.
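The steps above can be sketched in plain Python. This is a minimal illustration, not the poster's code: the 3x5 character-grid templates, the nearest-neighbour `rescale`, and the function names are all assumptions made for the example, and real images would first need to be binarized (for instance with a threshold) into a similar on/off grid.

```python
# Hypothetical 3x5 binary templates: '#' is a digit pixel, '.' is background.
TEMPLATES = {
    2: ("###", "..#", "###", "#..", "###"),
    5: ("###", "#..", "###", "..#", "###"),
    7: ("###", "..#", "..#", "..#", "..#"),
}


def rescale(img, height, width):
    """Nearest-neighbour resize of a character grid to height x width."""
    src_h, src_w = len(img), len(img[0])
    return tuple(
        "".join(img[r * src_h // height][c * src_w // width] for c in range(width))
        for r in range(height)
    )


def match_score(a, b):
    """Fraction of pixels that agree between two grids of the same size."""
    total = len(a) * len(a[0])
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def identify(img, templates=TEMPLATES):
    """Return (digit, score); digit is None when the best score does not
    exceed the best match between two *different* templates."""
    first = next(iter(templates.values()))
    resized = rescale(img, len(first), len(first[0]))
    scores = {digit: match_score(resized, t) for digit, t in templates.items()}
    best = max(scores, key=scores.get)
    # Lower threshold: how similar the two most alike templates are to each other.
    floor = max(
        match_score(t1, t2)
        for d1, t1 in templates.items()
        for d2, t2 in templates.items()
        if d1 < d2
    )
    return (best if scores[best] > floor else None), scores[best]


# A "2" drawn at twice the template size (each pixel doubled in both directions).
big_two = tuple("".join(ch * 2 for ch in row) for row in TEMPLATES[2] for _ in range(2))
print(identify(big_two))  # (2, 1.0)
```

Deriving the threshold from the templates themselves means an input is only accepted when it resembles one template more than any two templates resemble each other, which is one way to read the suggestion above.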

If you don't have access to an OCR engine, you can build your own OCR system with a KNN classifier. In this case, the implementation should not be very difficult, as you are only classifying digits. OpenCV provides a very straightforward implementation of KNN.

The classifier is trained using features calculated from samples of known instances of each class. In this case, you have 10 classes (if you are working with digits 0-9), so you can prepare a "template" with your digits, extract some features, train the classifier and use it to classify new instances.

All of this can be done in OpenCV without extra libraries, and KNN (for this kind of application) has a more than acceptable accuracy rate.
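To make the train-then-classify idea concrete, here is a small sketch. Note that it does not use OpenCV's built-in classifier (cv2.ml.KNearest_create() in OpenCV 3 and later); it swaps in a hand-written nearest-neighbour vote in plain Python so the mechanics are visible without dependencies. The 3x5 grids, the raw-pixel features, and the names `features` and `knn_classify` are assumptions for illustration only, and only two of the ten classes are shown.

```python
from collections import Counter


def features(grid):
    """Flatten a character grid into a 0/1 feature vector ('#' = digit pixel)."""
    return [1 if ch == "#" else 0 for row in grid for ch in row]


def knn_classify(sample, training, k=3):
    """Majority vote among the k training vectors closest to `sample`.

    `training` is a list of (feature_vector, label) pairs; distance is the
    squared Euclidean distance between feature vectors.
    """
    by_distance = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], sample)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: a clean sample plus two noisy variants per digit.
SAMPLES = [
    (("###", "..#", "###", "#..", "###"), 2),
    (("##.", "..#", "###", "#..", "###"), 2),
    (("###", "..#", "###", "#..", ".##"), 2),
    (("###", "#..", "###", "..#", "###"), 5),
    (("##.", "#..", "###", "..#", "###"), 5),
    (("###", "#..", "###", "..#", "##."), 5),
]
training = [(features(grid), label) for grid, label in SAMPLES]

# A "2" with one pixel flipped that does not appear in the training set.
query = ("###", "..#", "#.#", "#..", "###")
print(knn_classify(features(query), training))  # 2
```

With real images, the features would be computed from the thresholded digit after resizing it to a fixed size, and each class would need enough samples to cover the variation in color and scale described in the question.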
