Having a problem with the readline() method in Python
I have a text file named test.txt which contains only the text: This is line 1
file = open("test.txt")
print(file.readline())
Now, when I run this code, I get this output:
ÿþThis is line 1
Why am I getting this ÿþ at the beginning of the output?
Your file is UTF-16 encoded with a BOM prefix (to indicate the byte order), but your locale's default encoding is a Western European one (e.g. latin-1 or cp1252), which interprets the BOM bytes (0xff followed by 0xfe) as ÿþ. The extraneous NUL bytes that UTF-16 places between the ASCII characters are likely being ignored when the output is displayed.
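To see where the two stray characters come from, here is a minimal sketch that works on bytes in memory rather than on the real file. It assumes the file is little-endian UTF-16 (which matches the 0xff 0xfe order described above) and uses cp1252 as a stand-in for the locale's default encoding; the variable names are only illustrative.

```python
import codecs

text = "This is line 1"

# Assumption: the file on disk is little-endian UTF-16 with a BOM in front.
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
print(raw[:2])  # b'\xff\xfe'

# Decoding with a Western European codec maps each byte to one character,
# so the BOM turns into two visible characters and the NUL bytes stay in the string.
wrong = raw.decode("cp1252")
print(repr(wrong[:4]))

# Decoding as UTF-16 consumes the BOM and pairs the bytes up correctly.
right = raw.decode("utf-16")
print(right)
```

Printing `repr(wrong)` rather than `wrong` makes the NUL characters visible, which a terminal would normally not show.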
Explicitly providing the correct encoding to open will let it decode the file correctly, e.g.:
with open("test.txt", encoding='utf-16') as file:
    print(file.readline())
Note that I switched to using a with statement for deterministic cleanup; the only change needed to fix the code is adding encoding='utf-16' to the open arguments.
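If you want to check the fix without touching your real file, the following sketch reproduces the situation end to end. It writes a throwaway UTF-16 file in a temporary directory (the path and variable names are made up for the example), then reads it back once with cp1252, standing in for the locale default, and once with the explicit encoding.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")  # throwaway stand-in for test.txt

    # Writing with encoding="utf-16" makes Python put a BOM at the start of the file.
    with open(path, "w", encoding="utf-16") as f:
        f.write("This is line 1\n")

    # Wrong codec: the two BOM bytes come back as two stray characters
    # (ÿþ on a little-endian machine).
    with open(path, encoding="cp1252") as f:
        garbled = f.readline()

    # Right codec: the BOM is consumed and the line decodes cleanly.
    with open(path, encoding="utf-16") as f:
        fixed = f.readline()

print(repr(garbled[:2]))
print(repr(fixed))
```

Passing the encoding explicitly also makes the script behave the same on machines whose locale default differs.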
Try using this:
file1 = open("test.txt", "r", encoding="utf-16")