
Having a problem with the readline() method in Python

I have a text file named test.txt which contains only the text: This is line 1

file = open("test.txt")
print(file.readline())

Now when I run this code, I get this output:

ÿþThis is line 1 

Why am I getting this ÿþ at the beginning of the output?

Your file is UTF-16 encoded with a BOM prefix (to indicate the byte order), but your locale's default encoding is a Western European one (e.g. latin-1 or cp1252), which interprets the BOM bytes (0xff followed by 0xfe) as ÿþ. The extraneous NUL bytes that UTF-16 places between the ASCII characters are likely being ignored.
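To see the mismatch without touching a real file, here is a minimal sketch that builds the bytes by hand and decodes them both ways. It assumes the file is little-endian UTF-16 (which the 0xff 0xfe BOM implies) and that the locale encoding is cp1252; the variable names are only illustrative.

```python
import codecs

text = "This is line 1"

# Little-endian UTF-16 with an explicit BOM (0xff 0xfe), as the answer describes.
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Assumption: the locale's default encoding is cp1252.
# Each byte is mapped to one character, so the BOM shows up as two letters
# and every ASCII character is followed by a NUL.
mis_decoded = raw.decode("cp1252")

# Decoding as UTF-16 consumes the BOM and pairs the bytes up correctly.
correct = raw.decode("utf-16")

print(repr(mis_decoded[:6]))
print(repr(correct))
```

The first repr should show the two BOM characters followed by letters interleaved with NULs, while the second should show the plain text.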

Explicitly providing the correct encoding to open will let it decode the file correctly, e.g.:

with open("test.txt", encoding='utf-16') as file:
    print(file.readline())

Note that I switched to using a with statement for deterministic cleanup; the only change needed to fix the code is adding encoding='utf-16' to the open arguments.
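If you want to check this behaviour end to end, the following sketch writes a throwaway UTF-16 file and reads it back with the explicit encoding. The temporary directory and the path inside it are stand-ins for your own test.txt, not something from the question.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the test.txt from the question.
    path = os.path.join(tmp, "test.txt")

    # Writing with encoding="utf-16" puts a BOM at the start of the file.
    with open(path, "w", encoding="utf-16") as f:
        f.write("This is line 1\n")

    # Peek at the first two raw bytes to confirm the BOM is there.
    with open(path, "rb") as f:
        head = f.read(2)

    # Reading with the matching encoding strips the BOM transparently.
    with open(path, encoding="utf-16") as f:
        first_line = f.readline()

print(head in (b"\xff\xfe", b"\xfe\xff"))
print(repr(first_line))
```

Which of the two BOM byte orders appears depends on the machine that wrote the file, so the check accepts either.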

Try using this:

file1 = open("test.txt","r", encoding = "utf-16")
