Pytesseract cannot recognize even a very simple text line

[Images: binary image "B2", binary image "Y2"]

I think these images are quite simple and clear, yet pytesseract still fails to read them. I really wonder why.

Here is my code:

from pytesseract import pytesseract as tesseract
import cv2 as cv

binary = cv.imread(filepath)

lang = 'eng'
config = 'tessedit_char_whitelist=RGB123'
print(tesseract.image_to_string(binary, lang=lang, config=config))

The output is just a blank string.

To Dennlinger's point, I would definitely rotate the image before sending it through PyTess. PyTess should rotate it automatically though. Should.

Alternatively, I see in your configuration that you have whitelisted "RGB123" which, correct me if I'm wrong, may mean that PyTess only looks for those specific letters and digits.

I'd try dropping the whitelist from your configuration so that it can pick up the "Y" in there.
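For what it's worth, here is a minimal sketch of both suggestions, under a few assumptions. The helper names build_config and read_line are made up for illustration. The "--psm 7" option (treat the image as a single text line) is my own addition rather than something from the question. As far as I know, Tesseract config variables also have to be passed with the -c flag, which the original config string lacks. The right rotation depends on how the text is oriented in your image, so rotate_code is left for you to pick.

```python
def build_config(whitelist=None, psm=7):
    """Build a Tesseract config string (hypothetical helper).

    psm=7 asks Tesseract to treat the image as a single text line.
    """
    parts = [f"--psm {psm}"]
    if whitelist:
        # Config variables are passed to Tesseract with the -c flag.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_line(filepath, whitelist=None, rotate_code=None):
    """OCR one text line from an image file (hypothetical helper)."""
    # Imported here so build_config stays usable without OpenCV or Tesseract.
    import cv2 as cv
    import pytesseract

    image = cv.imread(filepath)
    if image is None:
        # cv.imread returns None instead of raising when the file cannot be read.
        raise FileNotFoundError(filepath)
    if rotate_code is not None:
        # For example cv.ROTATE_90_CLOCKWISE; the right value is an assumption
        # that depends on how the text is oriented in the image.
        image = cv.rotate(image, rotate_code)
    text = pytesseract.image_to_string(
        image, lang="eng", config=build_config(whitelist)
    )
    return text.strip()
```

With the filepath from the question, you could first try read_line(filepath) with no whitelist at all, and then read_line(filepath, whitelist="RGBY123") if you want to restrict the characters again while still allowing the "Y".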
