Tesseract unable to recognize numbers from a simple image
This is the image, and I'm trying to extract "3158" from it.
And this is the code:
import cv2
import pytesseract

img = cv2.imread('cropped.png')
convert_to_string = pytesseract.image_to_string(img)
print(convert_to_string)
Unfortunately, it doesn't print anything.
I've tried
pytesseract.image_to_string(img, config='--psm 1 --oem 3')
and
pytesseract.image_to_string(img, config='--psm 6')
but still no luck.
Try binarizing the image first. Tesseract does not work well when the text does not stand out clearly from the background. Since the background here has a gradient, a single global threshold may not separate the digits cleanly, so adaptive thresholding as a preprocessing step should give you some good first results:
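import cv2
import pytesseract

img = cv2.imread('cropped.png')
# Adaptive thresholding needs a single-channel 8-bit image.
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Compare each pixel with the mean of its 11x11 neighbourhood minus 10.
img_bin = cv2.adaptiveThreshold(
    img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 10
)
# Convert back to three channels before passing the image to pytesseract.
img_bin = cv2.cvtColor(img_bin, cv2.COLOR_GRAY2BGR)
convert_to_string = pytesseract.image_to_string(img_bin)
print(convert_to_string)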
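To see why the adaptive variant helps on a gradient background, here is a rough pure-Python sketch of the idea behind mean adaptive thresholding, next to a plain global threshold. It is only an illustration, not OpenCV's implementation: the helper names `adaptive_mean_threshold` and `global_threshold`, the tiny made-up test image, and the clamped border handling are all my own assumptions.

```python
def adaptive_mean_threshold(gray, block_size=3, c=10, max_value=255):
    """Binarize `gray` (a list of rows of 0-255 ints) against a local mean.

    A pixel becomes `max_value` when it is brighter than the mean of its
    block_size x block_size neighbourhood minus `c`, otherwise 0.
    Borders are handled by clamping coordinates (an assumption of this sketch).
    """
    height, width = len(gray), len(gray[0])
    radius = block_size // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += gray[yy][xx]
            local_mean = total / (block_size * block_size)
            row.append(max_value if gray[y][x] > local_mean - c else 0)
        result.append(row)
    return result


def global_threshold(gray, thresh=128, max_value=255):
    """Binarize `gray` against one fixed threshold for the whole image."""
    return [[max_value if p > thresh else 0 for p in row] for row in gray]


# Made-up test image: the background brightens from left to right, and two
# darker "strokes" sit at columns 1 and 7 (50 below the local background).
row = [60, 30, 100, 120, 140, 160, 180, 150, 220]
image = [row[:] for _ in range(3)]

print(global_threshold(image)[1])         # loses the stroke at column 7
print(adaptive_mean_threshold(image)[1])  # keeps both strokes dark
```

With the global threshold, the left part of the background turns black together with the first stroke, and the second stroke disappears into the white area. The local mean follows the gradient, so both strokes stay dark on a white background, which is much closer to what Tesseract expects. The constant subtracted from the mean (10 in the answer's call) is what keeps smooth background areas white, since a background pixel is roughly equal to its own local mean.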