Python read and write 'ß' from file

I have a file.txt with the following content:

Straße
Straße 1
Straße 2

I want to read this text from the file and print it. I tried this, but it won't work.

import random

lmao1 = open('file.txt').read().splitlines()
lmao = random.choice(lmao1)
print str(lmao).decode('utf8')

But I get the error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xdf in position 5: invalid continuation byte

UTF-8 is not the correct encoding for this file. Try the following; if it doesn't work, try other common encodings until you find the right one.

print str(lmao).decode('latin-1')
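To see why the UTF-8 decode fails, it can help to compare the bytes directly. The following is a minimal Python 3 sketch; the helper name `try_decode` is made up for illustration, and the assumption is that the file was saved in Latin-1 (or a similar single-byte encoding), as the 0xdf byte in the error message suggests.

```python
# 'ß' is a single byte (0xdf) in Latin-1, but two bytes (0xc3 0x9f) in UTF-8.
text = 'Straße'

latin1_bytes = text.encode('latin-1')   # b'Stra\xdfe'
utf8_bytes = text.encode('utf-8')       # b'Stra\xc3\x9fe'


def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid for that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# In UTF-8, 0xdf announces a two-byte sequence, but the next byte ('e') is
# not a continuation byte, so the decode is rejected.
print(try_decode(latin1_bytes, 'utf-8'))
# Decoding with the encoding the bytes were written in recovers the text.
print(try_decode(latin1_bytes, 'latin-1') == text)
```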

If on Windows, the file is likely encoded in cp1252.
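For 'ß' specifically, latin-1 and cp1252 give the same result, which is why either one appears to work here. The two differ in the 0x80 to 0x9f range, so the choice matters for characters such as the euro sign. A short Python 3 sketch of the difference:

```python
# Both encodings map byte 0xdf to 'ß'.
eszett_cp1252 = b'\xdf'.decode('cp1252')
eszett_latin1 = b'\xdf'.decode('latin-1')
print(eszett_cp1252 == eszett_latin1)

# They disagree on 0x80: cp1252 has the euro sign there,
# while Latin-1 has an invisible C1 control character.
euro_cp1252 = b'\x80'.decode('cp1252')
ctrl_latin1 = b'\x80'.decode('latin-1')
print(repr(euro_cp1252), repr(ctrl_latin1))
```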

Whatever the encoding, use io.open and specify it explicitly. The code below works in both Python 2 and 3.

io.open returns Unicode strings. It is good practice to convert to and from Unicode at the I/O boundaries of your program. In this case that means reading the file as Unicode in the first place and leaving print to determine the appropriate encoding for the terminal.

Switching to Python 3 is also recommended, since its Unicode handling is greatly improved.

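from __future__ import print_function
import io
import random

# Decode the file to Unicode while reading it.
with io.open('file.txt', encoding='cp1252') as f:
    lines = f.read().splitlines()
line = random.choice(lines)
# print encodes the Unicode string for the terminal.
print(line)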
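The same approach applies when writing: pass the encoding to io.open and let it encode the Unicode text on the way out. Below is a minimal sketch that re-saves a file in a different encoding. The function name `convert_encoding` and its defaults are assumptions chosen for illustration, not part of the answer above.

```python
import io


def convert_encoding(src_path, dst_path,
                     src_encoding='cp1252', dst_encoding='utf-8'):
    """Read text using src_encoding and write it back out using dst_encoding."""
    with io.open(src_path, encoding=src_encoding) as src:
        text = src.read()      # decoded to Unicode at the input boundary
    with io.open(dst_path, 'w', encoding=dst_encoding) as dst:
        dst.write(text)        # encoded again at the output boundary
```

For example, `convert_encoding('file.txt', 'file_utf8.txt')` would produce a UTF-8 copy that the original `decode('utf8')` call could handle, provided the source really is cp1252.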

You're on the right track with decode. The only problem is that there is no way to guess a file's encoding with 100% certainty. Try a different encoding (e.g. latin-1).
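Trying encodings one after another can be sketched as a small helper. The function name `read_lines_with_fallback` and the candidate list are assumptions for illustration. Note that latin-1 accepts every possible byte, so it never raises and has to come last; a decode that succeeds only shows the bytes are valid in that encoding, not that the encoding is the right one.

```python
import io


def read_lines_with_fallback(path, encodings=('utf-8', 'cp1252', 'latin-1')):
    """Try each encoding in order; return (lines, encoding) for the first that decodes."""
    for encoding in encodings:
        try:
            with io.open(path, encoding=encoding) as f:
                return f.read().splitlines(), encoding
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding; try the next
    raise ValueError('none of the encodings could decode %r' % (path,))
```

Calling `read_lines_with_fallback('file.txt')` returns the decoded lines together with the encoding that was used, so the result can be checked by eye.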

It works fine at the Python prompt and when run from a Python script as well.

>>> import random
>>> lmao1 = open('file.txt').read().splitlines()
>>> lmao = random.choice(lmao1)
>>> print str(lmao).decode('utf8')
Straße 2

The above worked on Python 2.7. Which Python version are you using?
