Tesseract fails to recognize digits in a simple image

Question

This is the image, and I'm trying to extract "3158" from it.

And this is the code:

import cv2
import pytesseract

img = cv2.imread('cropped.png')
convert_to_string = pytesseract.image_to_string(img)
print(convert_to_string)

Unfortunately, it doesn't print anything.

I've tried

pytesseract.image_to_string(img, config='--psm 1 --oem 3')

and

pytesseract.image_to_string(img, config='--psm 6')

but still no luck.

Answer 1

Try binarizing the image first. Tesseract does not work well when the text does not stand out clearly from the background. Since the background has a gradient, you may get good initial results by preprocessing with adaptive thresholding:

import cv2
import pytesseract

img = cv2.imread('cropped.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

img_bin = cv2.adaptiveThreshold(
    img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 10
)

img_bin = cv2.cvtColor(img_bin, cv2.COLOR_GRAY2BGR)

convert_to_string = pytesseract.image_to_string(img_bin)
print(convert_to_string)
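
If the binarized image still yields noisy or empty output, it may also help to tell Tesseract that the input is a single line of digits. The following is only a sketch, not part of the original answer: `digits_only_config` and `read_digits` are hypothetical helper names, and the choice of `--psm 7` (treat the image as a single text line) together with the `tessedit_char_whitelist` variable is an assumption about what suits this image. Note that the whitelist is reported not to take effect with the LSTM engine in Tesseract 4.0, so results can depend on the installed version.

```python
def digits_only_config(psm=7):
    """Build a Tesseract config string that restricts output to 0-9.

    psm=7 asks Tesseract to treat the image as a single text line
    (an assumption that fits a cropped number such as "3158").
    """
    return f"--psm {psm} -c tessedit_char_whitelist=0123456789"


def read_digits(image, psm=7):
    """Run OCR on an image (e.g. img_bin above) and keep only digit characters."""
    # Imported lazily so the config helper works without Tesseract installed.
    import pytesseract

    text = pytesseract.image_to_string(image, config=digits_only_config(psm))
    # Drop whitespace and any stray non-digit characters from the OCR output.
    return "".join(ch for ch in text if ch.isdigit())
```

With the binarized image from the answer, this would be called as `print(read_digits(img_bin))`. If single-line mode does not fit the image, other page segmentation modes such as `--psm 6` or `--psm 8` can be passed through the `psm` argument.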
