
Extract text from image with pytesseract

I tried to extract the numbers from the original image https://imgur.com/a/adMaKGy, but with no luck.

Output from pytesseract is: "[a ]:[4] G2):Go] [7 ):Ce J"

Thank you for any advice.

My code:

import pytesseract
import cv2
pytesseract.pytesseract.tesseract_cmd = 'folder /tesseract.exe'
img = cv2.imread("folder /test_image.png")
text = pytesseract.image_to_string(img)
print(text)

The pytesseract README notes that OpenCV stores images in BGR order while pytesseract assumes RGB, so you need to convert the image before passing it in:

import cv2
import pytesseract
from PIL import Image

img_cv = cv2.imread(r'/<path_to_image>/digits.png')

# By default OpenCV stores images in BGR format and since pytesseract assumes RGB format,
# we need to convert from BGR to RGB format/mode:
img_rgb = cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB)
print(pytesseract.image_to_string(img_rgb))
# OR
# PIL expects size as (width, height), while img_cv.shape[:2] is (height, width)
img_rgb = Image.frombytes('RGB', img_cv.shape[1::-1], img_cv, 'raw', 'BGR', 0, 0)
print(pytesseract.image_to_string(img_rgb))
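
Since the image in the question contains only numbers, it may also help to tell Tesseract what to expect, in addition to fixing the channel order. The following is only a sketch under assumptions that the answer above does not state: that the digits sit on a single text line (hence `--psm 7`), that a digit-only whitelist is appropriate, and that Otsu thresholding suits this particular image. The helper names `digit_config` and `read_digits` are hypothetical, and the result is not guaranteed for the linked image.

```python
def digit_config(psm: int = 7, whitelist: str = "0123456789") -> str:
    """Build a Tesseract config string that restricts output to `whitelist`."""
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_digits(path: str) -> str:
    """Binarize the image at `path` and run digit-only OCR on it (assumed setup)."""
    # Imported here so digit_config() stays usable without OpenCV installed.
    import cv2
    import pytesseract

    img_bgr = cv2.imread(path)
    if img_bgr is None:
        # cv2.imread returns None instead of raising when the file cannot be read.
        raise FileNotFoundError(path)

    # Grayscale plus Otsu thresholding gives a two-level image; this is an
    # assumption about what helps here, not a step from the README.
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    return pytesseract.image_to_string(binary, config=digit_config()).strip()
```

If the numbers are spread over several lines or boxes, a different page segmentation mode (for example `digit_config(psm=6)`) might work better; that is something to try rather than a known fix.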
