
UnicodeDecodeError: 'ascii' codec can't decode '\xc3\xa8' together with '\xe8'

I am having this strange problem below:

>>> a=u'Pal-Andr\xe8'
>>> b='Pal-Andr\xc3\xa8'
>>> print "%s %s" % (a,b) # boom
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
>>> print "%s" % a
Pal-Andrè
>>> print "%s" % b
Pal-Andrè

I can print a and b separately, but not both together.

What's the problem? How can I print them both?

The actual problem is

b = 'Pal-Andr\xc3\xa8'

Here b is a plain (byte) string literal, not a unicode literal. So when you format them separately, a is treated as a unicode string and b is treated as a normal byte string.

>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
'Pal-Andr\xc3\xa8'

Note that the u prefix is missing from the second result. You can confirm this with type:

>>> type("%s" % b)
<type 'str'>
>>> type("%s" % a)
<type 'unicode'>

But when you format them together, Python 2 promotes the result to a unicode string and implicitly decodes b with the ascii codec. The byte \xc3 is not valid ASCII (it is outside the 0-127 range), and that is why the code fails.
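The implicit conversion can be made explicit. This is a minimal sketch, assuming the b'' prefix for the byte string (a no-op in Python 2.6+, required in Python 3) so that it runs on either version; the names message and same_text are only for illustration.

```python
a = u'Pal-Andr\xe8'          # text: 9 code points
b = b'Pal-Andr\xc3\xa8'      # bytes: the same name encoded as UTF-8 (10 bytes)

# Python 2 does roughly this behind the scenes when a byte string
# meets a unicode string in a "%s %s" format.
try:
    b.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the codec the bytes were actually written in succeeds.
same_text = (b.decode('utf-8') == a)
```

Here message holds the same text as the traceback in the question, and same_text is True.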

To fix it, make b a unicode string as well, for example by decoding its UTF-8 bytes:

>>> a=u'Pal-Andr\xe8'
>>> b='Pal-Andr\xc3\xa8'.decode('utf-8')
>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
u'Pal-Andr\xe8'
>>> "%s %s" % (a, b)
u'Pal-Andr\xe8 Pal-Andr\xe8'
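One caveat: putting a u prefix in front of the escaped UTF-8 bytes does not give the same text as a. In a unicode literal, \xc3 and \xa8 are read as two separate code points, so the result avoids the exception but displays as mojibake. A small sketch of the difference, with wrong, right and repaired as illustrative names:

```python
wrong = u'Pal-Andr\xc3\xa8'   # two code points at the end: U+00C3 and U+00A8
right = u'Pal-Andr\xe8'       # one code point at the end: U+00E8

lengths = (len(wrong), len(right))   # (10, 9)

# Latin-1 maps code points 0-255 straight back to single bytes, so the
# original UTF-8 bytes can be recovered and then decoded properly.
repaired = wrong.encode('latin-1').decode('utf-8')
```

After this, repaired == right holds, while wrong != right.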

I am not sure what the real issue is here, but one thing is for sure: a is a unicode string and b is a byte string.

You will have to encode or decode one of them before printing them both.

Here is an example.

>>> b = b.decode('utf-8') 
>>> print u"%s %s" % (a,b)
Pal-Andrè Pal-Andrè
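The other direction, encoding a instead of decoding b, might look like the following sketch. It assumes UTF-8 is the desired output encoding and uses the b'' prefix so that it also runs on Python 3.5+; combined is an illustrative name.

```python
a = u'Pal-Andr\xe8'
b = b'Pal-Andr\xc3\xa8'

# Encode the unicode string so that both values are UTF-8 byte strings,
# then format with a byte-string template.
combined = b'%s %s' % (a.encode('utf-8'), b)
```

The result is a byte string, so it only displays correctly if the terminal expects UTF-8.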

Mixing Unicode and byte strings makes the combined format try to promote everything to Unicode. You have to decode the byte string with the correct codec yourself, otherwise Python 2 defaults to ascii. Here b is a byte string encoded in UTF-8. The format string is promoted as well, but that happens to work because it contains only ASCII characters. It is best to use Unicode everywhere:

>>> print u'%s %s' % (a,b.decode('utf8'))
Pal-Andrè Pal-Andrè
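One way to apply "Unicode everywhere" is to normalize values at the boundary before formatting. The helper below is a hypothetical sketch (to_text is not a standard library function), and it assumes incoming byte strings are UTF-8 unless told otherwise:

```python
def to_text(value, encoding='utf-8'):
    """Return value as a unicode string, decoding byte strings first."""
    if isinstance(value, bytes):      # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value

a = u'Pal-Andr\xe8'
b = b'Pal-Andr\xc3\xa8'

# Both arguments are unicode by the time they reach the format string,
# so no implicit ascii decoding takes place.
line = u'%s %s' % (to_text(a), to_text(b))
```

With this, line contains the name twice as unicode text, and a wrong encoding guess fails loudly at the decode call rather than inside a later format or print.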
