UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 1: invalid continuation byte

I want to convert a bytes value to a string. There are earlier questions related to mine, but when I compute the MD5 hash of a file's content this way:

import hashlib
with open("C:\\boot.ini","r") as f:
    r=f.read()
a=hashlib.md5()
a.update(r.encode('utf8'))
bytes_data=a.digest()
print(bytes_data)
r=type(bytes_data)
print(r) # <-- Just to be sure, it is in bytes 
myString=bytes_data.decode(encoding='UTF-8')

I got this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 1: invalid continuation byte

I understand the cause of the problem thanks to this question. However, I am hashing many different files and have no control over the resulting bytes, so how can I resolve this?

The hash.digest() return value is not a UTF-8-encoded string. Don't try to decode it; it is a sequence of bytes in the range 0-255 and these bytes do not represent text.

Not all byte content encodes text; this digest is one such value.
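
To see this concretely, here is a minimal sketch that hashes a small literal and tries to decode the raw digest. The input b"hello" and the variable names are illustrative assumptions, not taken from the question.

```python
import hashlib

# Illustrative input (an assumption for this sketch); any bytes would do.
raw = hashlib.md5(b"hello").digest()

# An MD5 digest is always 16 raw bytes, each in the range 0-255.
print(len(raw))

try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    # The digest contains byte values that do not form valid UTF-8 sequences.
    decoded_ok = False

print(decoded_ok)
```

Some digests may happen to decode without an error, but the resulting characters would still be meaningless, so decoding is never the right tool here.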

Use hash.hexdigest() if you want something printable. This method returns the digest as a string of hexadecimal digits (two hex characters per digest byte), which is the form commonly used when sharing an MD5 digest with others.
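
As a rough sketch of how this could look for arbitrary files, the helper below reads the file in binary mode and returns the hex digest. The function name md5_hex_of_file and the chunk_size default are hypothetical choices for illustration, not part of hashlib.

```python
import hashlib


def md5_hex_of_file(path, chunk_size=65536):
    """Return the MD5 digest of the file at `path` as a hex string.

    Hypothetical helper for illustration. The file is opened in binary
    mode, so no text decoding or re-encoding step is involved.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    # hexdigest() returns a str of 32 hex characters for MD5.
    return md5.hexdigest()
```

Calling md5_hex_of_file("C:\\boot.ini") would then give a plain str that can be printed or compared directly. Reading in binary mode also avoids the encode('utf8') round trip from the question, which could itself fail or change the hash for files that are not valid text.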
