
Unicode Decode Error while reading text from image

I used the code from the question "Reading text from image" to read text from an image file.

The code is as follows:

from PIL import Image
from pytesseract import image_to_string

image = Image.open("image.jpg",'r')

myText = image_to_string(Image.open(open('maxresdefault.jpg')),config='-psm 10')
myText = image_to_string(Image.open(open('maxresdefault.jpg')))
print(myText)

Error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 278: character maps to <undefined>

I tried the fix suggested in "UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>", but then got a different error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

According to the Image documentation ( help(Image.open) ), the image file must be opened in binary mode:

open('maxresdefault.jpg', 'rb')
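To see why text mode fails, here is a minimal stdlib-only sketch that reproduces both error messages from the question. The file name `fake.jpg` and its contents are made up for illustration: it holds the two bytes every JPEG starts with (FF D8) plus a 0x81 byte, and is not a valid image. That cp1252 was the default codec on the asker's machine is an assumption based on the 'charmap' wording of the first error.

```python
import os
import tempfile

# Stand-in for a JPEG: the start-of-image marker (FF D8), two more header
# bytes, and a 0x81 byte that has no mapping in cp1252. Not a real image.
payload = b"\xff\xd8\xff\xe0" + b"\x81"

path = os.path.join(tempfile.mkdtemp(), "fake.jpg")  # hypothetical file name
with open(path, "wb") as fh:
    fh.write(payload)


def try_text_read(encoding):
    """Read the file in text mode; return the decode error message, if any."""
    try:
        with open(path, "r", encoding=encoding) as fh:
            fh.read()
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


cp1252_error = try_text_read("cp1252")  # assumed default codec on Windows
utf8_error = try_text_read("utf-8")     # codec named in the second error

# Binary mode does no decoding at all, so the bytes come back unchanged.
with open(path, "rb") as fh:
    raw = fh.read()

print(cp1252_error)
print(utf8_error)
print(raw == payload)
```

Changing the text encoding only moves the failure from one byte to another; the image bytes are not text in any encoding, which is why 'rb' is the actual fix.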

Load the image in binary mode.

Changing my code to the following solved the problem for me.

import PIL.Image
# Open the file itself in binary mode; PIL.Image.open() only accepts mode "r".
pil_image = PIL.Image.open(open(image_path, "rb"))
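Applied to the code from the question, the fix might look like the sketch below. The helper name `ocr_image` is hypothetical, and the sketch assumes Pillow, pytesseract and the Tesseract binary are all installed.

```python
from PIL import Image


def ocr_image(image_path, config=""):
    """Hypothetical helper: run OCR on one image file opened in binary mode."""
    # Imported here so the file-handling part can be tried without Tesseract.
    from pytesseract import image_to_string

    # "rb" hands Pillow the raw bytes; text mode would try to decode them
    # as characters and raise UnicodeDecodeError.
    with open(image_path, "rb") as fh:
        with Image.open(fh) as image:
            return image_to_string(image, config=config)
```

It would be called as print(ocr_image("maxresdefault.jpg")). Passing the file name straight to Image.open("maxresdefault.jpg") also works, because Pillow then opens the file in binary mode itself.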

Hope it helps!
