Unicode encoding when reading from text file

Question

I hope you can help.

I'm trying to take a string and check whether or not it is in a text file called PasswordList. This is the code I have written to do this:

Password = input('Enter a password: ')    
with open('PasswordList.txt') as f:
    Found = False
    for line in f:
        if Password in line: 
            print(line)
            Found = True
    if not Found:
        print('Password is not in list')

If I enter something like the letter "e", it prints the lines that contain it until it reaches position 4853, where it raises an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position 4853: ordinal not in range(128).

I guess it has to do with the difference between ASCII and Unicode: is Python trying to use the ascii codec to decode a non-ASCII character?

If I try

import sys
print(sys.getdefaultencoding())

Then I get "utf-8" as the default encoding.

I'm stuck, what can I do?

Answer 1

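Open the file with an explicit encoding, using io.open from the io module (on Python 3 the built-in open accepts the same encoding argument):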

import io
with io.open('PasswordList.txt', encoding='cp1252') as f:
    ...

However, you do need to know what encoding the data is in. The file itself usually doesn't record this information, so you have to know how it was created.
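
As a minimal sketch of how this fits the code in the question, the search loop can take the encoding as a parameter. The function name find_matches and the cp1252 default are assumptions for illustration; substitute whatever encoding the file was actually saved in. Note that the built-in open() takes its default from the locale (locale.getpreferredencoding()), not from sys.getdefaultencoding(), which is likely why the 'ascii' codec shows up in the error even though the latter reports utf-8.

```python
import io


def find_matches(password, path, encoding="cp1252"):
    """Return the lines of `path` that contain `password` (hypothetical helper)."""
    matches = []
    # Passing the encoding explicitly avoids depending on the locale default.
    # "cp1252" is only an assumed default; use the file's real encoding.
    with io.open(path, encoding=encoding) as f:
        for line in f:
            if password in line:
                matches.append(line.rstrip("\n"))
    return matches
```

Calling find_matches(Password, 'PasswordList.txt') and printing each returned line reproduces the original loop; an empty list corresponds to the 'Password is not in list' case.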

Answer 2

To determine the encoding of a file created with Notepad, open the file in Notepad and select File | Save As from the menu. Near the bottom of the dialog, the current encoding appears in the Encoding dropdown (screenshot attached).

Now you can try opening the file with that encoding, using io.open or codecs.open, as suggested by wim.
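
If Notepad is not available, one rough alternative is to try a short list of candidate encodings and keep the first one that decodes the whole file without error. This is only a sketch: the helper name guess_encoding and the candidate list are assumptions, and a successful decode does not prove the guess is right, since single-byte encodings accept almost any byte sequence.

```python
def guess_encoding(path, candidates=("utf-8", "cp1252")):
    """Return the first candidate that decodes `path` cleanly, or None.

    Hypothetical helper; the candidate list is an assumption, not a rule.
    """
    # Read raw bytes so that no decoding happens implicitly.
    with open(path, "rb") as f:
        data = f.read()
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the file's bytes
        return encoding
    return None
```

Stricter encodings such as utf-8 are listed first because they reject most data that was not written in them, which makes a match more meaningful.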

2 answers

solution1
2 ACCEPTED 2015-11-24 02:40:46

solution2
0 2015-11-24 02:52:15
