Unicode Decode Error while reading text from image

Question

I used the code from the post "Reading text from image" to read text from an image file.

The code is as follows:

from PIL import Image
from pytesseract import image_to_string

image = Image.open("image.jpg",'r')

myText = image_to_string(Image.open(open('maxresdefault.jpg')),config='-psm 10')
myText = image_to_string(Image.open(open('maxresdefault.jpg')))
print(myText)

Error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 278: character maps to <undefined>

I tried to fix this by following the answers to: UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>

Then I got this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

Answer 1

According to the Image documentation ( help(Image.open) ), the image file must be opened in binary mode:

open('maxresdefault.jpg', 'rb')
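
To illustrate why the mode matters, here is a minimal sketch that uses only the standard library and assumes Python 3. It writes a few JPEG-like bytes to a temporary file (the helper names and the temporary file are hypothetical, not part of the question's code), then reads the file back in text mode and in binary mode. Text mode tries to decode the bytes with a text codec, which is where both errors from the question come from; binary mode returns the bytes untouched.

```python
import os
import tempfile

# Typical first bytes of a JPEG file, plus 0x81, a byte cp1252 cannot map.
SAMPLE_BYTES = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x81])


def read_text_mode(path, encoding):
    """Read the file in text mode; return the decode error message, or None."""
    try:
        with open(path, "r", encoding=encoding) as f:
            f.read()
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


def read_binary_mode(path):
    """Read the file in binary mode; no decoding takes place."""
    with open(path, "rb") as f:
        return f.read()


def demo():
    # Hypothetical stand-in for an image file such as maxresdefault.jpg.
    fd, path = tempfile.mkstemp(suffix=".jpg")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(SAMPLE_BYTES)
        utf8_error = read_text_mode(path, "utf-8")
        cp1252_error = read_text_mode(path, "cp1252")
        raw = read_binary_mode(path)
        return utf8_error, cp1252_error, raw
    finally:
        os.remove(path)


utf8_error, cp1252_error, raw = demo()
print(utf8_error)    # a 'utf-8' codec error about byte 0xff
print(cp1252_error)  # a 'charmap' codec error about byte 0x81
print(raw == SAMPLE_BYTES)
```

The 'charmap' message is what text mode produces when the default encoding is a Windows code page such as cp1252, and the 'utf-8' message is what appears once the encoding is switched to UTF-8. Neither encoding helps, because image data is not text.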

Answer 2

Load the image in binary mode.

Changing the code as follows solved the problem for me.

import PIL.Image
# Image.open() only accepts mode "r", so pass it a file object opened in binary mode
pil_image = PIL.Image.open(open(image_path, "rb"))

Hope it helps!

2 answers

solution1
0 2018-02-12 08:36:48

solution2
0 2019-12-13 15:21:54
