Improving OCR results on an image with pytesseract

Question

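I am using pytesseract to read numbers from the screen in real time. The image consists mainly of digits, dots, and two letters (M and R), as shown below. The numbers change continuously, but the letters M and R stay fixed. The background is always green with black text.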

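As you can see, the numbers in the image are quite clear, but the results pytesseract reads are not really satisfactory. Sometimes it reads a 7 as a 1. I would like to find an algorithm that helps improve the OCR results.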

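Currently I use Pillow to convert the image to grayscale, and I have tried resizing the image to be larger or smaller, but this has not improved the results much. I also applied filters to the image, as shown below, but the results are still not 100% correct.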

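import cv2
import pytesseract as tess

scale_factor = 2  # assumed value; the original does not state the factor used
img = cv2.imread('screenshot.png')
img = cv2.resize(img, None, fx=scale_factor, fy=scale_factor, interpolation=cv2.INTER_CUBIC)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Smooth noise while keeping edges, then binarize with Otsu's threshold
img = cv2.threshold(cv2.bilateralFilter(img, 5, 75, 75), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
text = tess.image_to_string(img)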

Please suggest any algorithm that could help improve these OCR results.

Answer 1

You can detect the text easily by applying a simple threshold.

Threshold	Result
	3845.86 M51.31 M 309.12 3860.43 R191.90 R23.44

Thresholding brings out the features of the image.
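To make the thresholding step less of a black box, here is a minimal, dependency-free sketch of the idea behind Otsu's method: pick the gray level that maximizes the between-class variance of the two pixel groups, then apply an inverse binary threshold. The function name `otsu_threshold` and the sample pixel values (30 for dark text, 150 for the background) are assumptions made for illustration; OpenCV's own implementation may differ in detail.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best separates two pixel classes."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += hist[level]        # pixels at or below this level
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg   # pixels above this level
        if weight_fg == 0:
            break
        sum_bg += level * hist[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance: large when the two groups are far apart.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


# Hypothetical grayscale values: 20 dark text pixels, 80 lighter background pixels.
pixels = [30] * 20 + [150] * 80
threshold = otsu_threshold(pixels)

# Inverse binary threshold: pixels at or below the threshold become white (255),
# everything brighter becomes black (0).
binary = [255 if p <= threshold else 0 for p in pixels]
```

With two well-separated groups like these, the dark text pixels end up white and the background ends up black, which is the kind of clean two-tone image the OCR engine receives.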

Code:

import cv2
import pytesseract

img = cv2.imread("UEWHj.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Otsu picks the threshold automatically; BINARY_INV inverts the result
thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
txt = pytesseract.image_to_string(thr)
print(txt)
# Display the thresholded image until a key is pressed
cv2.imshow("thr", thr)
cv2.waitKey(0)
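Because the question states that the image contains only digits, dots, and the letters M and R, one further option, not part of the answer above, is to restrict Tesseract to those characters. The sketch below is an assumption-laden illustration: the helper names `build_tesseract_config` and `read_numbers` are hypothetical, page segmentation mode 6 (a single uniform block of text) is a guess at the layout, and whether the LSTM engine honors `tessedit_char_whitelist` may depend on the installed Tesseract version.

```python
def build_tesseract_config(allowed_chars, psm=6):
    """Build a Tesseract config string that restricts recognizable characters."""
    return f"--psm {psm} -c tessedit_char_whitelist={allowed_chars}"


def read_numbers(binary_image, allowed_chars="0123456789.MR"):
    """Run OCR on a preprocessed image using the restricted character set."""
    # Imported here so the config helper can be used without Tesseract installed.
    import pytesseract

    config = build_tesseract_config(allowed_chars)
    return pytesseract.image_to_string(binary_image, config=config)


# The config string that would be passed to pytesseract.
config = build_tesseract_config("0123456789.MR")
```

Calling `read_numbers(thr)` on a thresholded image such as `thr` would run the OCR with this restricted character set. Results should be compared against the unrestricted call, since a whitelist can also force a wrong character where the engine would otherwise have returned nothing.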

Solution 1
1 2021-02-14 09:34:03
