
OCR with tesseract, pre-processing image

I need to extract digits from images like the one shown below. I'm using Tesseract now, but it isn't working. Can anyone help me pre-process the images before feeding them to Tesseract?

[image: enter image description here]

I don't think Tesseract is the right tool for this. Tesseract can only handle very clear letters.
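If you still want to give Tesseract a chance, binarizing the image first is the usual starting point. Here is a minimal sketch, not a recipe tested on your images: it implements Otsu thresholding in plain NumPy and then hands the result to pytesseract. The function names are made up for illustration, the input is assumed to be an 8-bit grayscale array, and the `--psm 7` (single text line) and digit whitelist settings are guesses about what your images look like.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the grey level that maximises between-class variance (Otsu)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                 # pixel count at or below each level
    w1 = hist.sum() - w0                 # pixel count above each level
    sum0 = np.cumsum(hist * levels)
    mu0 = sum0 / np.maximum(w0, 1)       # mean of the darker class
    mu1 = (sum0[-1] - sum0) / np.maximum(w1, 1)  # mean of the brighter class
    between = w0 * w1 * (mu0 - mu1) ** 2
    return int(np.argmax(between))


def binarize(gray):
    """Map an 8-bit grayscale image to pure black (0) and white (255)."""
    t = otsu_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8)


def read_digits(binary):
    """Run Tesseract on a binarized image; assumes pytesseract is installed."""
    import pytesseract  # imported lazily so the rest works without it

    # Assumption: one line of digits per image.
    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    return pytesseract.image_to_string(binary, config=config).strip()
```

Tesseract generally does better with dark text on a light background, so if your digits come out white on black after `binarize`, try `255 - binary` before calling `read_digits`.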
If your numbers all look like those in the picture, you can use the OpenCV ORB detector: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_orb/py_orb.html
Or, if that doesn't work, you can use a deep learning approach such as SSD Keras or YOLO.
https://github.com/pierluigiferrari/ssd_keras
Another option is to split the image into its individual digits (this is easy if they are all the same size) and train a very simple convolutional neural network with Keras.
https://keras.io/
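
The split-and-classify idea could look roughly like this. It is only a sketch under assumptions: the digits are taken to be evenly spaced across the image width, `split_digits` and `build_digit_cnn` are hypothetical names, the layer sizes are arbitrary, and the network is untrained, so you would still need labelled digit crops (all resized to the same height and width) to fit it.

```python
import numpy as np


def split_digits(image, n_digits):
    """Cut a 2-D image into n_digits side-by-side cells of (near) equal width."""
    return np.array_split(image, n_digits, axis=1)


def build_digit_cnn(height, width):
    """Build a small, untrained CNN for 10-class digit crops.

    Assumes TensorFlow/Keras is installed; imported lazily so that
    split_digits can be used without it.
    """
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(height, width, 1)),   # single-channel crops
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),  # one output per digit 0-9
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer labels 0-9
        metrics=["accuracy"],
    )
    return model
```

For example, `split_digits(img, 6)` on an image containing six digits returns six arrays, each of which you would resize and stack into a batch of shape `(n, height, width, 1)` before calling `model.fit` or `model.predict`.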
