
Python - Read a file which contains bytes and strings

I am working on a Python program that analyzes files and keeps only the parts I want. I get an error when I open some of them. These files contain both strings and bytes, like this:

file.py:
if byte == "0xFD":
    for byte in bytes.split["0xFD"]:
...

When I open this kind of file, the strings between quotes seem to be interpreted as bytes, and the program crashes with: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 59752: character maps to <undefined>. I get the same kind of error with 'utf-8'.
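A note on the likely cause, as a small sketch: the ASCII text "0xFD" is just six ordinary characters and decodes fine. The traceback points at a raw byte 0x9d somewhere in the file, which has no mapping in cp1252 (the codec that reports itself as 'charmap') and is not a valid start byte in UTF-8. The sample bytes and the helper name `try_decode` below are made up for illustration.

```python
# Hypothetical file content: ASCII source text followed by one stray byte 0x9d.
raw = b'if byte == "0xFD": \x9d'


def try_decode(data, encoding):
    """Return the decoded text, or a short description of the decode failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{encoding}: {exc.reason}"


# The quoted "0xFD" is plain ASCII and decodes without any problem.
ascii_part = try_decode(b'if byte == "0xFD":', "cp1252")

# The raw byte 0x9d is what breaks both codecs.
cp1252_result = try_decode(raw, "cp1252")
utf8_result = try_decode(raw, "utf-8")

print(ascii_part)
print(cp1252_result)
print(utf8_result)
```

So the fix is about how the undecodable byte is handled, not about the quoted strings.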

So my question is: how can I read that line without the bytes being interpreted (I want to keep the line exactly as it is)?

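If you pass "rb" as the mode to open(), the file is read in binary mode: no decoding is attempted, and read() returns a bytes object instead of a str, so no UnicodeDecodeError can occur.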

with open("insertfilenamehere.txt", "rb") as file:
    txt = file.read()  # txt is a bytes object, not a str
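A minimal sketch of working with the result of a binary read, assuming the goal is to find lines containing the literal text "0xFD". The sample content and the temporary file are invented for the example. Because the lines are bytes objects, they have to be compared against bytes literals (b"..."). If a str is needed afterwards, decoding with latin-1 is one option that cannot fail, since it maps every byte value to exactly one code point.

```python
import os
import tempfile

# Hypothetical sample file: ASCII source text plus one stray non-text byte.
sample = b'if byte == "0xFD":\n    data = \x9d\n'

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)
    file_path = tmp.name

try:
    # Binary mode: nothing is decoded, so no UnicodeDecodeError is possible.
    with open(file_path, "rb") as f:
        raw_lines = f.read().splitlines()
finally:
    os.remove(file_path)

# Search with a bytes literal, because each line is a bytes object.
matching = [line for line in raw_lines if b'"0xFD"' in line]

# latin-1 maps each byte 0-255 to one code point, so this decode cannot fail.
as_text = [line.decode("latin-1") for line in raw_lines]

print(matching)
print(as_text)
```

The trade-off is that latin-1 may show the wrong characters for bytes that were really encoded in something else, but no information is lost: encoding back with latin-1 gives the original bytes.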

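I found a way to do what I want: open the file in text mode with errors='replace', so that any byte that cannot be decoded is replaced with the U+FFFD replacement character instead of raising an error:

with open(file_path, "r", encoding="utf-8", errors='replace') as file: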
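For comparison, here is a short sketch of what a few standard error handlers do with an undecodable byte. It uses bytes.decode() on made-up sample data; open() accepts the same values for its errors parameter. Note that 'replace' is lossy (the original byte is gone), whereas 'surrogateescape' lets the original bytes be recovered when encoding back with the same handler.

```python
# Hypothetical line containing one byte (0x9d) that is not valid UTF-8.
raw = b'if byte == "0xFD": \x9d\n'

# 'replace': the bad byte becomes U+FFFD, the Unicode replacement character.
replaced = raw.decode("utf-8", errors="replace")

# 'backslashreplace': the bad byte becomes the visible text \x9d.
escaped = raw.decode("utf-8", errors="backslashreplace")

# 'ignore': the bad byte is silently dropped.
ignored = raw.decode("utf-8", errors="ignore")

# 'surrogateescape': the bad byte is kept as a lone surrogate, so the
# original bytes can be restored by encoding with the same handler.
round_trip = raw.decode("utf-8", errors="surrogateescape").encode(
    "utf-8", errors="surrogateescape"
)

print(repr(replaced))
print(repr(escaped))
print(repr(ignored))
print(round_trip == raw)
```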

I found some help in this post: Unicode error handling with Python 3's readlines()
