Python - 'ascii' codec can't decode byte

Question

I'm using Python 2.6 and Jinja2 to create HTML reports. I pass many results to the template, and the template loops through them and creates HTML tables.

When calling template.render, I've suddenly started getting this error.

<td>{{result.result_str}}</td>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)

The strange thing is, even if I set result.result_str to a simple ascii string like "abc" for every result, I am still seeing this error. I'm new to Jinja2 and Python and would appreciate any ideas on how I can go about investigating the problem to get to the root cause.
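One way to investigate is to scan the objects passed to the template for byte strings that are not pure ASCII. The sketch below is only an illustration: `Result` and `find_non_ascii` are hypothetical names (neither is part of Jinja2), and it assumes each result keeps its fields as ordinary instance attributes.

```python
class Result(object):
    """Hypothetical stand-in for the objects passed to the template."""
    def __init__(self, **fields):
        self.__dict__.update(fields)


def find_non_ascii(results):
    """Return (index, attribute, value) for each byte-string attribute
    that cannot be decoded as ASCII."""
    offenders = []
    for index, result in enumerate(results):
        for name, value in sorted(vars(result).items()):
            # On Python 2, `bytes` is an alias for `str` (byte strings).
            if isinstance(value, bytes):
                try:
                    value.decode('ascii')
                except UnicodeDecodeError:
                    offenders.append((index, name, value))
    return offenders


results = [
    Result(result_str=b'abc', owner=b'\xc4\x85'),  # `owner` is the culprit
    Result(result_str=u'ok'),
]
offenders = find_non_ascii(results)
```

If this reports nothing, the non-ASCII bytes are probably coming from somewhere other than the result objects, for example another context variable or the template source itself.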

Answer 1

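Try adding this at the top of your script (Python 2 only; the reload is needed because sys.setdefaultencoding is removed from the sys module at startup):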

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

It fixed my problem, good luck.

Answer 2

From http://jinja.pocoo.org/docs/api/#unicode

Jinja2 is using Unicode internally which means that you have to pass Unicode objects to the render function or bytestrings that only consist of ASCII characters.

So wherever you set result.result_str, you need to make it unicode, e.g.

result.result_str = unicode(my_string_variable, "utf8")

(if your bytes are UTF-8 encoded text)

or

result.result_str = u"my string"

Answer 3

If you get an error with a string like "ABC", maybe the non-ASCII character is somewhere else. In the template source perhaps?

In any case, use Unicode strings throughout your application to avoid this kind of problem. If your data source provides byte strings, convert them to unicode strings with byte_string.decode('utf-8'), assuming the data is encoded in UTF-8. If your source is a file, use the StreamReader class in the codecs module.
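As a rough sketch of decoding at the boundary, the snippet below shows both routes mentioned above: decode() on a byte string, and a codecs StreamReader wrapped around a byte stream. The helper name `to_text`, the sample bytes, and the use of io.BytesIO in place of a real file are assumptions made for illustration, as is the UTF-8 encoding.

```python
import codecs
import io


def to_text(value, encoding='utf-8'):
    """Return a Unicode string; byte strings are decoded with `encoding`."""
    if isinstance(value, bytes):  # `bytes` is `str` on Python 2
        return value.decode(encoding)
    return value


raw = b'\xc4\x85bc'  # sample UTF-8 bytes (assumed data)

# Route 1: decode the byte string directly.
from_bytes = to_text(raw)

# Route 2: wrap a byte stream in a StreamReader from the codecs module.
stream = io.BytesIO(raw)  # stands in for a file opened in binary mode
reader = codecs.getreader('utf-8')(stream)
from_stream = reader.read()
```

Both `from_bytes` and `from_stream` are Unicode strings, so they can be passed to template.render without triggering the implicit ASCII decode.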

If you're unsure about the difference between Unicode strings and regular strings, read this: http://www.joelonsoftware.com/articles/Unicode.html

Answer 4

Just encountered the same problem in a piece of code which saves output from Jinja2 to HTML files:

with open(path, 'wb') as fh:
    fh.write(template.render(...))

It's easy to blame Jinja2, but the actual problem is Python 2's built-in open(): as of version 2.7 it has no encoding parameter, so a unicode string written to the file is implicitly encoded with the default (ASCII) codec. The fix is as simple as:

import codecs
with codecs.open(path, 'wb', 'utf-8') as fh:
    fh.write(template.render(...))
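An alternative that should behave the same way is io.open, which accepts an encoding argument and is available from Python 2.6 onwards. This is a sketch rather than the answer's own method: the `rendered` string stands in for the output of template.render(...), and the temporary path is made up for the example.

```python
import io
import os
import tempfile

# Stand-in for template.render(...), which returns a Unicode string.
rendered = u'<td>\u0104</td>'

# Hypothetical output location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), 'report.html')

# Text mode with an explicit encoding: io.open encodes on write.
with io.open(path, 'w', encoding='utf-8') as fh:
    fh.write(rendered)

# Read the file back as raw bytes to confirm it holds UTF-8.
with io.open(path, 'rb') as fh:
    written = fh.read()
```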

Answer 5

Plain str objects may contain UTF-8 encoded bytes, but they are not of type unicode. Calling decode on a str converts it to unicode. Works in Python 2.5.5.

my_string_variable.decode("utf8")

Answer 6

ASCII is a 7-bit code. The value 0xC4 cannot be stored in 7 bits. Therefore, you are using the wrong encoding for that data.
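A small demonstration of this, assuming the data is actually UTF-8: the traceback only shows the byte 0xc4, so the second byte (0x85) below is made up for the example.

```python
data = b'\xc4\x85'  # 0xC4 as in the traceback, plus an assumed second byte

# Decoding as ASCII fails because 0xC4 is above 0x7F.
try:
    data.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError as exc:
    ascii_ok = False
    message = str(exc)

# Decoding with the right codec yields a single Unicode character.
decoded = data.decode('utf-8')
```

Here `message` reproduces the error from the question, and `decoded` is a one-character Unicode string.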

Answer 7

Alternatively, you can run

export LANG='en_US.UTF-8'

in your console where you run the script.

7 answers

solution1
77 2013-02-17 08:41:54

solution2
41 2011-02-18 11:29:06

solution3
20 ACCEPTED 2011-02-18 11:27:47

solution4
9 2014-09-25 15:42:52

solution5
4 2011-04-17 13:24:47

solution6
0 2011-02-18 11:18:34

solution7
-1 2013-02-20 12:19:19
