Tesseract unable to recognize numbers from a simple image

Question

This is the image, and I'm trying to extract "3158" from it.

And this is the code

import cv2
import pytesseract

img = cv2.imread('cropped.png')
convert_to_string = pytesseract.image_to_string(img)
print(convert_to_string)

Unfortunately, it doesn't print anything.

I've tried

pytesseract.image_to_string(img, config='--psm 1 --oem 3')

and

pytesseract.image_to_string(img, config='--psm 6')

but still no luck.

Answer 1

Try binarizing the image first. Tesseract does not work well when the text does not stand out clearly from the background. Since there's a gradient in the background, a single global threshold is unlikely to separate the digits everywhere, so adaptive thresholding is a good first preprocessing step to try:

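import cv2
import pytesseract

# Load the image and convert it to grayscale
img = cv2.imread('cropped.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold each pixel against the mean of its 11x11 neighbourhood minus 10
img_bin = cv2.adaptiveThreshold(
    img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 10
)

# Convert the binary image back to 3 channels before passing it to Tesseract
img_bin = cv2.cvtColor(img_bin, cv2.COLOR_GRAY2BGR)

convert_to_string = pytesseract.image_to_string(img_bin)
print(convert_to_string)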
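
To see why adaptive thresholding copes with a gradient where a single global threshold would not, here is a rough pure-Python sketch of the mean-based rule (a pixel becomes white if it is brighter than the mean of its neighbourhood minus a constant). It is only an illustration: the function name `adaptive_mean_threshold` and the synthetic image are made up for this example, and the edge handling (clipping the window) is a simplification, not what OpenCV does internally.

```python
def adaptive_mean_threshold(gray, block_size=11, c=10, max_value=255):
    """Binarize a grayscale image given as a list of rows of ints (0-255)."""
    height, width = len(gray), len(gray[0])
    half = block_size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighbourhood window, clipped at the image edges (simplification).
            y0, y1 = max(0, y - half), min(height, y + half + 1)
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            # White if the pixel is brighter than (local mean - c), else black.
            out[y][x] = max_value if gray[y][x] > local_mean - c else 0
    return out


# Synthetic test image (hypothetical data): a left-to-right brightness
# gradient with two dark vertical strokes standing in for "ink".
width, height = 40, 15
image = [[100 + 3 * x for x in range(width)] for _ in range(height)]
for y in range(height):
    for x in (10, 30):
        image[y][x] -= 60  # ink at x=30 is as bright as background at x=10

binary = adaptive_mean_threshold(image)
ink_columns = sorted({x for row in binary for x, v in enumerate(row) if v == 0})
print(ink_columns)
```

In this synthetic image the ink on the bright side has the same grey value as the background on the dark side, so no single global threshold can separate both strokes, while the local comparison marks only the two ink columns as black.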

solution1
1 2022-05-15 07:49:20