UnicodeDecodeError: 'ascii' codec can't decode '\xc3\xa8' together with '\xe8'

Question

I am having this strange problem:

>>> a=u'Pal-Andr\xe8'
>>> b='Pal-Andr\xc3\xa8'
>>> print "%s %s" % (a,b) # boom
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
>>> print "%s" % a
Pal-Andrè
>>> print "%s" % b
Pal-Andrè

I can print a and b separately, but not both together.

What's the problem? How can I print them both?

Answer 1

The actual problem is

b = 'Pal-Andr\xc3\xa8'

Here b is a plain (byte) string literal, not a unicode literal. So when you format them separately, a is treated as a Unicode string and b is treated as a normal byte string.

>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
'Pal-Andr\xc3\xa8'

Note that the leading u is missing from the second result. You can confirm this further:

>>> type("%s" % b)
<type 'str'>
>>> type("%s" % a)
<type 'unicode'>

But when you format them together, the result has to be a unicode string, so Python 2 implicitly decodes b with the ascii codec. The byte \xc3 is not valid ASCII, and that is why the code fails.
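The implicit step can be reproduced by hand. The following is a minimal sketch, not the interpreter's actual code path: it assumes the implicit conversion behaves like an explicit `b.decode('ascii')`, and it writes b with a `b''` prefix (an addition, not in the original session) so that it runs unchanged on Python 2.6+ and Python 3.

```python
a = u'Pal-Andr\xe8'        # text: one code point, U+00E8, at the end
b = b'Pal-Andr\xc3\xa8'    # bytes: the same name encoded as UTF-8

# Decode b as ASCII explicitly; this fails on the first non-ASCII byte.
try:
    b.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error.encoding)                               # codec that rejected the data
print(error.start)                                  # index of the offending byte
print(b[error.start:error.start + 1] == b'\xc3')    # the byte named in the traceback
```

The reported index matches "position 8" in the traceback from the question, since the eight ASCII characters of `Pal-Andr` occupy positions 0 to 7.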

To fix it, declare b as a unicode literal as well. Note that inside a unicode literal the escape must name the code point (\xe8), not the UTF-8 bytes (\xc3\xa8); keeping \xc3\xa8 would produce the two characters Ã¨ instead of è.

>>> a=u'Pal-Andr\xe8'
>>> b=u'Pal-Andr\xe8'
>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
u'Pal-Andr\xe8'
>>> "%s %s" % (a, b)
u'Pal-Andr\xe8 Pal-Andr\xe8'
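One thing worth checking with this approach is that escapes in a unicode literal name code points rather than UTF-8 bytes. The sketch below compares the two spellings; the variable names are illustrative and not from the original answer.

```python
code_point_form = u'Pal-Andr\xe8'        # ends with one character, U+00E8
byte_escape_form = u'Pal-Andr\xc3\xa8'   # ends with two characters, U+00C3 and U+00A8

print(len(code_point_form))                   # 9
print(len(byte_escape_form))                  # 10
print(code_point_form == byte_escape_form)    # False

# The UTF-8 encoding of U+00E8 is the byte pair 0xC3 0xA8,
# which is where the \xc3\xa8 in the byte string comes from.
utf8_bytes = code_point_form.encode('utf-8')
print(utf8_bytes == b'Pal-Andr\xc3\xa8')      # True
```

So adding a u prefix to a literal that contains UTF-8 byte escapes avoids the exception but does not yield the same text.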

Answer 2

I am not sure what the real issue is here, but one thing is certain: a is a unicode string and b is a byte string.

You will have to encode or decode one of them before printing them both.

Here is an example.

>>> b = b.decode('utf-8') 
>>> print u"%s %s" % (a,b)
Pal-Andrè Pal-Andrè
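Both directions mentioned above can be sketched side by side. This assumes b holds UTF-8 data, as its bytes suggest; the `b''` prefix and the names `as_text` and `as_bytes` are additions for illustration, chosen so the snippet runs on both Python 2.6+ and Python 3.

```python
a = u'Pal-Andr\xe8'        # text
b = b'Pal-Andr\xc3\xa8'    # UTF-8 encoded bytes

# Option 1: decode the bytes, then combine everything as text.
as_text = u"%s %s" % (a, b.decode('utf-8'))

# Option 2: encode the text, then combine everything as bytes.
as_bytes = b' '.join([a.encode('utf-8'), b])

print(as_text == u'Pal-Andr\xe8 Pal-Andr\xe8')
print(as_bytes == b'Pal-Andr\xc3\xa8 Pal-Andr\xc3\xa8')
```

Either way works as long as both operands are of the same kind before they are combined; decoding to text is usually the easier choice when the result is meant for display.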

Answer 3

Having a mix of Unicode and byte strings makes the combined print try to promote everything to Unicode strings. You've got to decode the byte string with the correct codec, else Python 2 will default to ascii. b is a byte string encoded in UTF-8. The format string is promoted as well, but that happens to work because it contains only ASCII characters. Best to use Unicode everywhere:

>>> print u'%s %s' % (a,b.decode('utf8'))
Pal-Andrè Pal-Andrè
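One way to apply "Unicode everywhere" is to decode byte strings once, at the point where they enter the program. The helper below is a hypothetical sketch (the name `to_text` and the UTF-8 default are assumptions, not part of the original answer) and runs on Python 2.6+ and Python 3.

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):   # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value

a = u'Pal-Andr\xe8'        # already text, passed through unchanged
b = b'Pal-Andr\xc3\xa8'    # UTF-8 bytes, decoded by the helper

line = u'%s %s' % (to_text(a), to_text(b))
print(line == u'Pal-Andr\xe8 Pal-Andr\xe8')
```

The encoding has to be known from context; the helper only guesses UTF-8 by default and will raise UnicodeDecodeError if the data is in a different encoding.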

Answer 1: score 2, accepted, 2015-01-22 14:56:14

Answer 2: score 0, 2015-01-22 15:09:42

Answer 3: score 0, 2015-01-22 19:16:22
