Pytesseract image_to_data cannot read the numbers in my image

Question

I'm working on a project that uses pyautogui and pytesseract to take a screenshot of the timer in a video game emulator, then read the image to determine what time I got. Here's what the image looks like when I use pyautogui to capture the region I want:

In-game timer

Simply using pytesseract.image_to_string() worked on images of text when I tested it to make sure it was installed properly, but with the in-game timer picture it doesn't output anything. Does this have to do with the quality of the image, some limitation of pytesseract, or something else?

Answer 1

You need to preprocess the image before performing OCR with Pytesseract. Here's a simple approach using OpenCV and Pytesseract OCR. The idea is to obtain a processed image where the text to extract is black and the background is white. To do this, we convert to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. We perform text extraction with the --psm 6 configuration option, which assumes a single uniform block of text. Other page segmentation modes are listed in the Tesseract documentation.

Input image

Otsu's threshold to get a binary image

Result from Pytesseract OCR

0’ 12”92
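
If the OCR result then needs to become a number, a small parser can help. The sketch below assumes the timer always reads minutes, seconds and hundredths in the form shown in the single sample result above. That format, the name parse_timer, and the tolerance for straight quotes in place of typographic ones are assumptions for illustration, not part of the original answer.

```python
import re

# Assumed format: M' SS"CC (minutes, seconds, hundredths), written with either
# typographic (’ ”) or straight (' ") quote marks, since OCR may emit both.
TIMER_RE = re.compile(r"(\d+)\s*[’']\s*(\d{1,2})\s*[”\"]\s*(\d{2})")

def parse_timer(text):
    """Return the time in seconds, or None if no timer pattern is found."""
    match = TIMER_RE.search(text)
    if match is None:
        return None
    minutes, seconds, hundredths = (int(g) for g in match.groups())
    # Sum in whole hundredths first so only one float division happens.
    return (minutes * 6000 + seconds * 100 + hundredths) / 100

print(parse_timer("0’ 12”92"))
```

Returning None on a failed match lets the caller tell an OCR miss apart from a real time of zero.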

Code

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Perform text extraction
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.waitKey()

Answer 1 (score 2, accepted), 2022-05-11 01:49:36
