
Tesseract unable to recognize numbers from a simple image

This is the image, and I'm trying to extract "3158":

[image]

And this is the code:

import cv2
import pytesseract

img = cv2.imread('cropped.png')
convert_to_string = pytesseract.image_to_string(img)
print(convert_to_string)

But unfortunately it failed to print anything.

I've tried

pytesseract.image_to_string(img, config='--psm 1 --oem 3')

and

pytesseract.image_to_string(img, config='--psm 6')

But still no luck.

Try binarizing the image first. Tesseract does not work well if the font does not stand out clearly from the background. Since there's a gradient in the background, you may get good initial results with adaptive thresholding as a preprocessing step:

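import cv2
import pytesseract

# Load the image and convert it to grayscale, as adaptiveThreshold expects a single channel
img = cv2.imread('cropped.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold each pixel against the mean of its 11x11 neighborhood minus 10
img_bin = cv2.adaptiveThreshold(
    img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 10
)

# Convert the binary image back to 3 channels before passing it to Tesseract
img_bin = cv2.cvtColor(img_bin, cv2.COLOR_GRAY2BGR)

convert_to_string = pytesseract.image_to_string(img_bin)
print(convert_to_string)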
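
To see why adaptive thresholding copes with a gradient where a single global cutoff does not, here is a small pure-Python sketch of the idea behind cv2.ADAPTIVE_THRESH_MEAN_C. It is an illustration only, not OpenCV's implementation: the synthetic image, the helper names (global_threshold, adaptive_mean_threshold), the block size of 3, and the border handling (neighborhoods are simply clipped at the edges) are all assumptions made for this example.

```python
def global_threshold(img, thresh=128):
    """Binarize with one cutoff for the whole image."""
    return [[255 if p > thresh else 0 for p in row] for row in img]


def adaptive_mean_threshold(img, block_size=3, c=10):
    """Binarize each pixel against (mean of its neighborhood - c)."""
    h, w = len(img), len(img[0])
    r = block_size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Assumption: the neighborhood is clipped at the image borders.
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            vals = [img[j][i] for j in ys for i in xs]
            local_mean = sum(vals) / len(vals)
            row.append(255 if img[y][x] > local_mean - c else 0)
        out.append(row)
    return out


# Synthetic 5x12 image: the background brightens from left (80) to right (190).
HEIGHT, WIDTH = 5, 12
img = [[80 + 10 * x for x in range(WIDTH)] for _ in range(HEIGHT)]

# Two "stroke" pixels, each 40 levels darker than the local background.
strokes = [(2, 2), (2, 9)]
for y, x in strokes:
    img[y][x] -= 40

glob = global_threshold(img)
adapt = adaptive_mean_threshold(img)


def show(binary):
    """Print '#' for black pixels and '.' for white pixels."""
    for row in binary:
        print("".join("#" if p == 0 else "." for p in row))


print("global:")
show(glob)
print("adaptive:")
show(adapt)
```

With the global cutoff, the dark left side of the gradient turns black and swallows one stroke, while the other stroke on the bright side is still above the cutoff and disappears into white. The adaptive version compares each pixel only with its surroundings, so both strokes end up black on a white background.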


 