Tesseract-OCR not recognizing digits

I am using Tesseract OCR to recognize the digits in the picture below (an image of an electricity meter), but it is not working. I am not permitted to use machine learning or deep learning. Is there another technique I can use to solve this problem? Any guidance would be appreciated. Thank you for reading.

This is my original image: [image]

This is the processed image in which the digits should be recognized:
[image]

This is my code:

import cv2
import pytesseract as pts
pts.pytesseract.tesseract_cmd = r'C:\Users\Thep Ho\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

img = cv2.imread("images/text1.jpg")
text = pts.image_to_string(img)
print(text)
  • If you apply adaptive thresholding to the input image, you get:

  • [image]

  • Then, if you use a regular expression to remove all non-numeric characters from the extracted text, the result is:

    •  99951
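The regular-expression step can be tried on its own, without an image. This is only a minimal sketch: the helper name keep_digits and the sample string are made up for illustration and are not part of the original answer.

```python
import re


def keep_digits(ocr_text):
    """Return only the ASCII digits 0-9 found in raw OCR output."""
    # Everything that is not a digit (spaces, newlines, letters, punctuation) is dropped.
    return re.sub(r"[^0-9]", "", ocr_text)


# Hypothetical OCR output with leading whitespace and trailing control characters.
print(keep_digits(" 99951\n\x0c"))
```

Note that this only filters characters. A digit that was misread as a letter (for example "O" instead of "0") is removed, not corrected.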

Code:


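import re
import cv2
import pytesseract

# Load the image and convert it to grayscale
img = cv2.imread("Eadxj.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Inverted mean adaptive threshold: maxValue 252, 31x31 neighbourhood, constant C = 7
flt = cv2.adaptiveThreshold(gry,
                            252, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY_INV, 31, 7)
# Run OCR on the thresholded image
txt = pytesseract.image_to_string(flt)
# Keep only the digits 0-9
txt_int = re.sub(r"[^0-9]", "", txt)
print(txt_int)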
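To make the thresholding step less of a black box, here is a rough pure-Python sketch of what an inverted mean adaptive threshold computes. It is not OpenCV's implementation (rounding and border handling may differ in detail). The function name adaptive_mean_threshold_inv and the tiny 3x3 sample are assumptions for illustration only.

```python
def adaptive_mean_threshold_inv(gray, block_size=3, c=0, max_value=255):
    """Inverted mean adaptive threshold on a 2D list of grayscale values."""
    height, width = len(gray), len(gray[0])
    half = block_size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    # Clamp coordinates so border pixels reuse their nearest neighbour.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += gray[yy][xx]
            local_mean = total / (block_size * block_size)
            # Inverted output: a pixel at or below (local mean - c) becomes max_value.
            row.append(max_value if gray[y][x] <= local_mean - c else 0)
        out.append(row)
    return out


# Hypothetical 3x3 patch: one dark pixel surrounded by bright ones.
patch = [[200, 200, 200],
         [200, 50, 200],
         [200, 200, 200]]
result = adaptive_mean_threshold_inv(patch, block_size=3, c=7)
print(result)
```

Because each pixel is compared with the mean of its own neighbourhood rather than with one global value, uneven lighting across the image matters less than it would with a single fixed threshold.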

However, if you are allowed to use deep learning, the result will be:

[image]
