
Tesseract unable to recognize numbers from a simple image

This is the image, and I'm trying to extract "3158" from it:

[image]

And this is the code

import cv2
import pytesseract

img = cv2.imread('cropped.png')
convert_to_string = pytesseract.image_to_string(img)
print(convert_to_string)

But unfortunately it failed to print anything

I've tried

pytesseract.image_to_string(img, config='--psm 1 --oem 3')

and

pytesseract.image_to_string(img, config='--psm 6')

But still no luck

Try binarizing the image first. Tesseract does not work well when the text does not stand out clearly from the background. Since there's a gradient in the background, adaptive thresholding is a good first preprocessing step to try:

import cv2
import pytesseract

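img = cv2.imread('cropped.png')
# Adaptive thresholding needs a single-channel 8-bit image
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Compare each pixel against the mean of its 11x11 neighborhood minus 10
img_bin = cv2.adaptiveThreshold(
    img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 10
)

# Convert back to a 3-channel image before passing it to Tesseract
img_bin = cv2.cvtColor(img_bin, cv2.COLOR_GRAY2BGR)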

convert_to_string = pytesseract.image_to_string(img_bin)
print(convert_to_string)
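
To see why adaptive thresholding copes with a background gradient where a single global threshold does not, here is a rough NumPy-only sketch of the mean variant. The function name `adaptive_mean_threshold` and the synthetic test image are made up for illustration, the border handling (edge replication) is an assumption, and the output is not guaranteed to match `cv2.adaptiveThreshold` bit for bit. The idea is the same, though: each pixel is compared against the mean of its own neighborhood minus a constant, so the cutoff follows the local brightness.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def adaptive_mean_threshold(gray, block_size=11, c=10, max_value=255):
    """Illustrative mean adaptive threshold (hypothetical helper, not OpenCV's code).

    A pixel becomes max_value if it is brighter than the mean of its
    block_size x block_size neighborhood minus c, otherwise 0.
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    pad = block_size // 2
    # Assumption: replicate edge pixels so border windows are full size
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    windows = sliding_window_view(padded, (block_size, block_size))
    local_mean = windows.mean(axis=(-2, -1))
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)


# Synthetic stand-in for 'cropped.png': a left-to-right brightness gradient
# with two thin dark vertical strokes playing the role of the digits.
img = np.tile(np.linspace(60, 200, 60), (20, 1))
stroke = np.zeros(img.shape, dtype=bool)
stroke[4:16, 20:22] = True
stroke[4:16, 40:42] = True
img[stroke] -= 50
img = img.round().astype(np.uint8)

# A single global cutoff turns the darker left part entirely black,
# so the left stroke can no longer be told apart from its background.
global_bin = np.where(img > 128, 255, 0).astype(np.uint8)

# The local cutoff tracks the gradient and keeps both strokes separate.
result = adaptive_mean_threshold(img, block_size=11, c=10)

print("global: stroke vs. background on the left:", global_bin[10, 20], global_bin[10, 10])
print("adaptive: stroke vs. background on the left:", result[10, 20], result[10, 10])
```

In practice you would keep using `cv2.adaptiveThreshold` as in the snippet above; this version only spells out what the block size (11) and the constant (10) control. If the digits come out broken or noisy, those two values are the first things to adjust.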

