Python, UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1718: ordinal not in range(128)

I am trying to do some simple parsing of a file and get an error caused by special characters:

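#!/usr/bin/env python
# -*- coding: utf-8 -*-

infile = 'finance.txt'
input = open(infile)
for line in input:
  # line is a byte string here; comparing it with a unicode literal
  # makes Python 2 decode it implicitly as ASCII
  if line.startswith(u'▼'):
    pass  # process the matching line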

I get the error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 1718: ordinal not in range(128)

Solution?

You need to specify the encoding. For example, if the file is UTF-8:

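import io

infile = 'finance.txt'

# io.open() decodes the file contents, so each line is a unicode string
with io.open(infile, encoding='utf-8') as fobj:
    for line in fobj:
        if line.startswith(u'▼'):
            pass  # handle the matching line here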

This works in both Python 2 and 3. By default, Python 2 opens files without applying any encoding, i.e. reading the content returns byte strings. Calling startswith() with a unicode argument on such a byte string makes Python 2 decode it implicitly as ASCII, which fails on any byte above 0x7f, such as the 0xc2 in the error message. In Python 3 the default encoding is whatever locale.getpreferredencoding(False) returns, in many cases UTF-8. The built-in open() in Python 2 does not accept an encoding argument. Using io.open() makes the code future-proof, because you don't need to change it when switching to Python 3.
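To make the mechanism concrete, here is a small Python 3 sketch of what happens to the bytes. The sample text and the temporary file are made up for illustration; only the '▼' marker comes from the question.

```python
import io
import os
import tempfile

# Hypothetical sample content; only the marker character comes from the question.
sample = u'▼ revenue\nplain line\n'

# Write the sample as UTF-8 to a temporary file.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as raw:
    raw.write(sample.encode('utf-8'))

# Binary mode returns undecoded bytes, much like open() in Python 2.
with io.open(path, 'rb') as fobj:
    first_bytes = fobj.readline()

# Decoding those bytes as ASCII fails on the first byte above 0x7f.
try:
    first_bytes.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the file's real encoding gives text that can be compared safely.
first_text = first_bytes.decode('utf-8')

os.remove(path)

print(repr(first_bytes))
print(ascii_error)
print(first_text.startswith(u'▼'))
```

The ASCII decode raises the same kind of UnicodeDecodeError as in the question, while the UTF-8 decode yields a string on which startswith() works as expected.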

In Python 3:

>>> io.open is open
True
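
Because the Python 3 default depends on the environment, it can help to check what a given machine would use. A minimal sketch, assuming Python 3; the variable names are made up for illustration:

```python
import codecs
import locale

# Encoding that the locale machinery reports as preferred on this machine.
preferred = locale.getpreferredencoding(False)

# Normalise the spelling (e.g. 'UTF-8' and 'utf8' both become 'utf-8').
normalised = codecs.lookup(preferred).name

print(preferred, normalised)
```

If the result is not the encoding of your file, pass encoding= explicitly instead of relying on the default.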

Open your file with the correct encoding. For example, if your file is UTF-8 encoded, in Python 3:

with open('finance.txt', encoding='utf8') as f:
    for line in f:  # iterate over the opened file object, not the builtin input
        if line.startswith(u'▼'):
            pass  # whatever

With Python 2 you can use io.open() (also works in Python 3):

import io

with io.open('finance.txt', encoding='utf8') as f:
    for line in f:  # iterate over the opened file object, not the builtin input
        if line.startswith(u'▼'):
            pass  # whatever
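
Putting the pieces together, a version-independent helper could look like the sketch below. The function name matching_lines and the sample data are hypothetical, not part of the original answers.

```python
import io
import os
import tempfile


def matching_lines(path, marker, encoding='utf-8'):
    """Return the lines of a text file that start with marker."""
    matches = []
    # io.open() accepts encoding= in both Python 2 and Python 3.
    with io.open(path, encoding=encoding) as fobj:
        for line in fobj:
            if line.startswith(marker):
                matches.append(line.rstrip(u'\n'))
    return matches


# Build a small UTF-8 file to try the helper on (made-up content).
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as raw:
    raw.write(u'▼ stocks\nbonds\n▼ funds\n'.encode('utf-8'))

result = matching_lines(demo_path, u'▼')
os.remove(demo_path)

print(len(result), 'matching lines')
```

Keeping the encoding as a parameter avoids hard-coding UTF-8 for files that were written with a different encoding.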
