Text (computer-written) recognition with Python?

I have images that aren't timestamped and I need to rename and timestamp them for a project.

Here's an example of the timestamp, written at the top of every image:

[image: enter image description here]

So the question is: is there a way to detect (read) each digit of the timestamp in the image shown? Perhaps I could use cv2 or tensorflow to do this?

I was also thinking of training a decision tree: I could crop each digit and build a set of same-sized, single-channel arrays to train on.

Thoughts?

Why don't you use a simple OCR algorithm? The numbers and letters in the image are very clear, so I think an OCR algorithm will work fine.

A simple test with https://ocr.space/ produces the following:

****** Result for Image/Page 1 ******
14042018 Ph.n   

Training a model would certainly help too, provided you have the corresponding labels.
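Once an OCR tool returns a string like the one above, the remaining work is pulling out the digits and building a file name from them. Here is a minimal sketch of that step. It assumes the eight digits are in DDMMYYYY order, which is only a guess from the sample output; `parse_timestamp` and `new_filename` are hypothetical helper names, not part of any library.

```python
import re
from datetime import datetime


def parse_timestamp(ocr_text, fmt="%d%m%Y"):
    """Return a datetime from the first 8-digit run in ocr_text, or None.

    Assumption: the digits are DDMMYYYY. Change fmt if your images
    use a different order.
    """
    match = re.search(r"\d{8}", ocr_text)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(), fmt)
    except ValueError:
        # The digits do not form a valid date in the assumed format.
        return None


def new_filename(timestamp, suffix=".png"):
    """Build a sortable file name such as 2018-04-14.png (hypothetical scheme)."""
    return timestamp.strftime("%Y-%m-%d") + suffix


ts = parse_timestamp("14042018 Ph.n   ")
if ts is not None:
    print(new_filename(ts))
```

Returning None for unreadable text lets you collect the images that need a manual look instead of renaming them incorrectly.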

If the text is all computer-written, you may want to try OCR with pytesseract.
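A rough sketch of how that could look, assuming Pillow and pytesseract are installed along with the Tesseract binary. The 40-pixel strip height is a made-up value for where the timestamp sits, and `top_strip_box` and `read_timestamp_text` are hypothetical helper names; adjust them to your images.

```python
def top_strip_box(width, height, strip_height=40):
    """Crop box (left, upper, right, lower) for the top strip of an image.

    strip_height=40 is an assumed value, not taken from the question.
    """
    return (0, 0, width, min(strip_height, height))


def read_timestamp_text(image_path, strip_height=40):
    """OCR only the top strip of the image, where the timestamp is written."""
    # Imported here so the helper above works without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        box = top_strip_box(img.width, img.height, strip_height)
        strip = img.crop(box).convert("L")  # single-channel grayscale
        # --psm 7 tells Tesseract to treat the input as one line of text.
        return pytesseract.image_to_string(strip, config="--psm 7").strip()
```

Cropping to the strip before running OCR keeps the rest of the picture from adding noise to the result. If the output still contains stray characters, restricting recognition to digits is another option to look into.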
