
Unicode encoding when reading from text file

I hope you can help.

I'm trying to take a string and check whether it appears in a text file called PasswordList.txt. This is the code I have written to do this:

Password = input('Enter a password: ')    
with open('PasswordList.txt') as f:
    Found = False
    for line in f:
        if Password in line: 
            print(line)
            Found = True
    if not Found:
        print('Password is not in list')

If I enter something like the letter "e", it prints the lines that contain it until it reaches position 4853, where it raises an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position 4853: ordinal not in range(128).

I guess it has to do with the difference between ASCII and Unicode, as in Python is trying to use the ascii codec to decode a non-ASCII character?

If I try

import sys
print(sys.getdefaultencoding())

then I get "utf-8" as the default encoding.

I'm stuck. What can I do?

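Open the file with the io module and pass the encoding explicitly. When no encoding is given, open() picks the locale's preferred encoding (see locale.getpreferredencoding()) rather than sys.getdefaultencoding(), which is why the ascii codec can be used even though the default encoding reports utf-8: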

import io
with io.open('PasswordList.txt', encoding='cp1252') as f:
    ...
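Combined with the loop from the question, this might look like the minimal sketch below. The function name find_matches and the cp1252 default are assumptions for illustration only; substitute the encoding the file was actually saved in.

```python
import io


def find_matches(password, path='PasswordList.txt', encoding='cp1252'):
    """Return the lines of `path` that contain `password`.

    `encoding` is an assumption: it must match how the file was saved.
    """
    matches = []
    # Passing the encoding explicitly avoids the platform default (ASCII here).
    with io.open(path, encoding=encoding) as f:
        for line in f:
            if password in line:
                matches.append(line.rstrip('\n'))
    return matches
```

Calling find_matches('e') returns a list of matching lines; an empty list corresponds to the question's 'Password is not in list' case.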

However, you do need to know what encoding the data is in. The file itself usually doesn't contain this information; you have to know how it was created.
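To illustrate why the encoding has to be known rather than guessed, here is a small sketch using the byte 0x82 from the traceback. The three candidate encodings are chosen only for comparison; nothing in the question says which one the file really uses.

```python
raw = b'\x82'  # the byte reported in the traceback

# The same byte decodes to a different character under each single-byte encoding.
as_cp1252 = raw.decode('cp1252')   # U+201A, a low single quotation mark
as_cp437 = raw.decode('cp437')     # U+00E9, e with an acute accent
as_latin1 = raw.decode('latin-1')  # U+0082, a non-printing control character

# ASCII only covers bytes 0-127, so it rejects 0x82 outright.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

All three decodes succeed without error, so a wrong guess produces wrong text silently instead of failing.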

To determine the encoding of a file created with Notepad, open the file in Notepad and select File | Save As from the menu. Near the bottom of the dialog, the current encoding appears in a dropdown (screenshot attached).
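If opening Notepad is not convenient, a partial alternative is to look for a byte order mark (BOM) at the start of the file, which Notepad typically writes for its UTF-16 ("Unicode") options and, in many versions, for UTF-8. The helper name sniff_bom below is hypothetical, and the sketch ignores UTF-32. A result of None only means no BOM was found: the file could still be an ANSI code page such as cp1252, or UTF-8 without a BOM.

```python
import codecs

# Each BOM is paired with a codec name that strips the BOM while decoding.
_BOMS = (
    (codecs.BOM_UTF8, 'utf-8-sig'),
    (codecs.BOM_UTF16_LE, 'utf-16'),
    (codecs.BOM_UTF16_BE, 'utf-16'),
)


def sniff_bom(path):
    """Return a codec name if `path` starts with a known BOM, else None."""
    with open(path, 'rb') as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```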

Now you can try using codecs.open as suggested by wim.

