Problem with the readline() method in Python

Question

I have a text file named test.txt that contains only the text: This is line 1

file = open("test.txt")
print(file.readline())

Now, when I run this code, I get this output:

ÿþThis is line 1

Why am I getting this ÿþ at the beginning of the output?

Answer 1

Your file is UTF-16 encoded with a BOM prefix (to indicate the byte order), but your locale's default encoding is a Western European one (e.g. latin-1 or cp1252), which interprets the BOM bytes (0xff followed by 0xfe) as ÿþ. The extraneous NUL bytes that UTF-16 places between each ASCII character are likely being ignored by your terminal when the text is printed.
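To see the effect described above without touching a real file, here is a minimal sketch. It assumes the file is little-endian UTF-16 (which the 0xff 0xfe byte order implies) and uses cp1252 as a stand-in for the locale's default encoding; the variable names are illustrative only.

```python
import codecs

text = "This is line 1"

# Bytes that a little-endian UTF-16 file with a BOM would contain (assumed layout).
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# A single-byte Western European codec maps every byte to one character,
# so the two BOM bytes become visible characters and the NUL bytes stay in the string.
misdecoded = raw.decode("cp1252")

# The utf-16 codec reads the BOM to pick the byte order and then drops it.
decoded = raw.decode("utf-16")

print(repr(raw[:4]))         # the BOM followed by the first encoded character
print(repr(misdecoded[:4]))  # BOM bytes shown as text, plus a NUL character
print(decoded)
```

The repr calls make the NUL characters visible; a plain print of the mis-decoded string would typically hide them.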

Explicitly providing the correct encoding to open will let it decode the file correctly, e.g.:

with open("test.txt", encoding='utf-16') as file:
    print(file.readline())

Note that I switched to using a with statement for deterministic cleanup; the only change needed to fix the code is adding encoding='utf-16' to the open arguments.
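If you don't know in advance whether a file carries a BOM, one possible approach is to peek at its first bytes and choose the encoding from that. The sketch below is not part of the original answer: guess_bom_encoding is a hypothetical helper, the "utf-8" fallback is an assumption, and it can only recognize files that actually start with a BOM.

```python
import codecs
import os
import tempfile


def guess_bom_encoding(path, default="utf-8"):
    """Hypothetical helper: pick a codec name from the file's BOM, if any."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Check UTF-32 first: its little-endian BOM starts with the same
    # two bytes as the UTF-16 little-endian BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return default  # no BOM found: fall back to the assumed default


# Demo on a temporary stand-in for test.txt.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-16") as f:  # the utf-16 codec writes a BOM
        f.write("This is line 1\n")

    encoding = guess_bom_encoding(path)
    with open(path, encoding=encoding) as f:
        first_line = f.readline()

print(encoding, repr(first_line))
```

When the encoding is already known, as it is in the question, passing encoding='utf-16' directly is simpler and more reliable than guessing.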

Answer 2

Try using this:

file1 = open("test.txt","r", encoding = "utf-16")


Answer 1
1 2020-05-01 00:22:30

Answer 2
-1 2020-05-01 00:17:07
