Tesseract output changing, adding, and removing numbers from very clear image

I am working on a program that uses a webcam to read constantly changing digits off a screen using pytesseract (long story). It takes an image of the whole screen, then crops out each number that needs to be recorded (there are 23 of them) using predetermined coordinates stored in a list called 'roi'. There are some other steps, but this is the most important part. Currently it adds, deletes, and changes digits all the time, and not in any consistent way. Here are some examples:

Image 1: it reads this incorrectly as '32.0'

Image 2: it reads this correctly as '52.0'

Image 3: it reads this incorrectly as '39.3'

Image 4: it reads this incorrectly as '2499.1'

These images have already been processed using OpenCV, and this is what all the images in the roi set look like. Based on other answers, I have binarized them, tried to clean up the edges, and put a white border around each image (see code).

This program reads the screen every 30 seconds, sometimes getting it right and other times getting it wrong. It often changes 5s into 3s, 3s into 5s, and 5s into 9s. Sometimes it misses digits or adds extra ones altogether. Below is my code for processing the images.

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"<tesseract file path>"  # placeholder: path to the tesseract executable
scale = 1.4
img = cv2.imread(r"<image file path>")  # placeholder: path to the captured screen image
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.rotate(img, cv2.ROTATE_180)
# Shrink the whole frame by a factor of 1.4
width = int(img.shape[1] / scale)
height = int(img.shape[0] / scale)
dim = (width, height)
img = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
cv2.destroyAllWindows()

myData = []
cong = r'--psm 6 -c tessedit_char_whitelist=+0123456789.-'

# roi is the list of predetermined boxes, each ((x1, y1), (x2, y2)), defined elsewhere
for x, r in enumerate(roi):
    imgCrop = img[r[0][1]:r[1][1], r[0][0]:r[1][0]]
    # Dividing by 0.2 enlarges each crop to 5x its size
    scalebig = 0.2
    wid = int(imgCrop.shape[1] / scalebig)
    hei = int(imgCrop.shape[0] / scalebig)
    newdims = (wid, hei)
    imgCrop = cv2.resize(imgCrop, newdims)

    # Binarize with a fixed threshold
    imgCrop = cv2.threshold(imgCrop, 155, 255, cv2.THRESH_BINARY)[1]

    # Morphological closing to clean up the edges
    kernel2 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    imgCrop = cv2.morphologyEx(imgCrop, cv2.MORPH_CLOSE, kernel2, iterations=2)

    # Add a 10-pixel white border on every side
    value = [255, 255, 255]
    imgCrop = cv2.copyMakeBorder(imgCrop, 10, 10, 10, 10, cv2.BORDER_CONSTANT, None, value=value)

    datapoint = pytesseract.image_to_string(imgCrop, lang='eng', config=cong)
    myData.append(datapoint)

The output is the pictures I linked above.

I have looked into fine-tuning it, but I have a Windows machine and I can't seem to find a good tutorial. I am not a programmer by trade. I spent 2 months teaching myself Python to do this, but the machine learning aspect of Tesseract has me spinning, and I don't know how else to fix such remarkably inconsistent readings. If you need any further info, please ask and I'll be happy to tell you.

Edit: Added some more incorrectly read images for reference.

  1. Make sure you use the right image format. JPEG is the wrong format for OCR because its lossy compression adds artifacts around character edges; a lossless format such as PNG avoids this.
  2. With the Tesseract LSTM engine, make sure the letter size is not bigger than 35 points.
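The two points above could be applied to each crop roughly as follows. This is only a sketch, not the answerer's code: the function names, the `glyph_height_px` measurement, and the 32-pixel target are hypothetical, and treating the size limit as a digit height in pixels is an assumption (the answer states it in points).

```python
def scaled_size(width, height, glyph_height_px, target_glyph_px=32):
    """Return (new_width, new_height) so that digits end up about
    target_glyph_px tall. The 32 px default is an assumed value."""
    if glyph_height_px <= 0:
        raise ValueError("glyph_height_px must be positive")
    factor = target_glyph_px / glyph_height_px
    return max(1, round(width * factor)), max(1, round(height * factor))


def resize_and_save_png(img_crop, glyph_height_px, out_path, target_glyph_px=32):
    """Resize one crop to the target digit height and save it losslessly."""
    import cv2  # imported here so scaled_size stays usable without OpenCV

    h, w = img_crop.shape[:2]
    new_size = scaled_size(w, h, glyph_height_px, target_glyph_px)
    # INTER_AREA when shrinking, INTER_CUBIC when enlarging
    interp = cv2.INTER_AREA if new_size[0] < w else cv2.INTER_CUBIC
    resized = cv2.resize(img_crop, new_size, interpolation=interp)
    cv2.imwrite(out_path, resized)  # a .png extension gives lossless output
    return resized
```

For example, a 200x80 crop whose digits are 64 pixels tall would be halved to 100x40, while a 50x20 crop with 16-pixel digits would be doubled to the same size.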

With Tesseract's tessdata_best models I got these results:

tesseract 593_small.png -
59.3

tesseract 520_small.png -
52.0

tesseract 2491_small.png -
249.1
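The same runs can probably be reproduced from Python through pytesseract. The sketch below assumes the tessdata_best `eng.traineddata` file has been downloaded into a local folder; the folder path in the usage note and the helper names `build_config` and `read_number` are hypothetical. By default it passes only `--tessdata-dir`, matching the commands above, with the page segmentation mode and whitelist from the question available as optional arguments.

```python
def build_config(tessdata_dir=None, psm=None, whitelist=None):
    """Build a Tesseract config string for pytesseract."""
    parts = []
    if tessdata_dir:
        # Quoted so that paths containing spaces survive argument splitting
        parts.append(f'--tessdata-dir "{tessdata_dir}"')
    if psm is not None:
        parts.append(f"--psm {psm}")
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_number(image, tessdata_dir=None, psm=None, whitelist=None):
    """Run OCR on one image (file path, PIL image, or NumPy array)."""
    import pytesseract  # imported here so build_config works without it

    config = build_config(tessdata_dir, psm, whitelist)
    text = pytesseract.image_to_string(image, lang="eng", config=config)
    return text.strip()
```

A call such as `read_number("593_small.png", tessdata_dir="C:/tessdata_best")` would then correspond to the first command above, with the folder name being a placeholder for wherever the model file was saved.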
