How to read a txt file of byte literals into UTF-8 strings?
I have a .txt file with lines like these:
b'Afrikaans'
b'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b'
b'\xd0\x90\xd2\xa7\xd1\x81\xd1\x88\xd3\x99\xd0\xb0'
How can I turn these lines into UTF-8 strings so that the output looks like this:
Afrikaans
አማርኛ
Аҧсшәа
I tried this, but it only printed strings identical to the byte literals:
with open("encoded.txt", "rb") as filename:
    line = filename.readline().strip()
    while line:
        print(line.decode("utf-8"))
        line = filename.readline().strip()
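The file stores the textual form of each bytes object (the literal characters b, ', \, x and so on), so decoding the raw file content just gives that same text back. The lines are Python literals, so ast.literal_eval can parse them into Python bytes objects, which can then be decoded: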
import ast

with open('data.txt') as f:
    for line in f:
        print(ast.literal_eval(line).decode('utf8'))
Output:
Afrikaans
አማርኛ
Аҧсшәа
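If the file might contain blank lines or lines that are not bytes literals, a slightly more defensive variant may help. The following is only a sketch under those assumptions: the helper name decode_byte_literal_lines and the in-memory sample list are made up for illustration and stand in for the lines read from the file.

```python
import ast


def decode_byte_literal_lines(lines):
    """Yield a decoded str for each line that holds a Python bytes literal.

    Hypothetical helper, not part of the original answer.
    """
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines instead of failing on them
        value = ast.literal_eval(text)  # safely parse the literal; no code is executed
        if not isinstance(value, bytes):
            raise ValueError(f"expected a bytes literal, got {type(value).__name__}")
        yield value.decode("utf-8")


# Sample input standing in for the lines of the text file (illustrative only).
sample = [
    "b'Afrikaans'\n",
    "b'\\xe1\\x8a\\xa0\\xe1\\x88\\x9b\\xe1\\x88\\xad\\xe1\\x8a\\x9b'\n",
    "\n",
]
decoded = list(decode_byte_literal_lines(sample))
print(decoded)
```

With a real file, the same helper could be fed the open file object directly, for example decode_byte_literal_lines(open('data.txt', encoding='ascii')), since iterating over a text file yields its lines.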