UnicodeDecodeError: 'ascii' codec can't decode '\xc3\xa8' together with '\xe8'

I am having this strange problem:

>>> a=u'Pal-Andr\xe8'
>>> b='Pal-Andr\xc3\xa8'
>>> print "%s %s" % (a,b) # boom
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
>>> print "%s" % a
Pal-Andrè
>>> print "%s" % b
Pal-Andrè

I can print a and b separately, but not both together.

What's the problem? How can I print them both?

The actual problem is this line:

b = 'Pal-Andr\xc3\xa8'

Here b is a plain (byte) string literal, not a unicode literal. So when you format them separately, a yields a unicode string and b yields a normal str.

>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
'Pal-Andr\xc3\xa8'

Note that the u prefix is missing from the second result. You can confirm this with type:

>>> type("%s" % b)
<type 'str'>
>>> type("%s" % a)
<type 'unicode'>

But when you format them together, the result has to be a unicode string, so Python 2 implicitly decodes b with its default ascii codec. The byte \xc3 is not valid ASCII, and that is why the code fails.
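
The implicit step can be made visible by doing the decode by hand. This is a minimal sketch, assuming the promotion is equivalent to decoding with the ascii codec; implicit_promotion is a hypothetical name used only for illustration. The b prefix makes the snippet behave the same on Python 2 and Python 3.

```python
a = u'Pal-Andr\xe8'        # unicode text
b = b'Pal-Andr\xc3\xa8'    # the same name as UTF-8 encoded bytes


def implicit_promotion(byte_string):
    """Hypothetical helper: mimic the decode Python 2 performs on its own
    when a byte string is mixed with a unicode string in "%s %s"."""
    return byte_string.decode('ascii')


try:
    implicit_promotion(b)
    error_message = None
except UnicodeDecodeError as exc:
    # Same failure as in the traceback from the question.
    error_message = str(exc)

print(error_message)
```

The message names byte 0xc3, the first byte of the UTF-8 sequence for è, which is the first byte in b outside the ASCII range.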

To fix it, declare b as a unicode literal as well. A unicode literal must contain the code point \xe8 rather than the UTF-8 bytes \xc3\xa8; u'Pal-Andr\xc3\xa8' would not raise, but it would be the two characters Ã¨ instead of è.

>>> a=u'Pal-Andr\xe8'
>>> b=u'Pal-Andr\xe8'
>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
u'Pal-Andr\xe8'
>>> "%s %s" % (a, b)
u'Pal-Andr\xe8 Pal-Andr\xe8'

I am not sure what the real issue is here, but one thing is for sure: a is a unicode string and b is a byte string.

You will have to encode or decode one of them before printing them both.

Here is an example.

>>> b = b.decode('utf-8') 
>>> print u"%s %s" % (a,b)
Pal-Andrè Pal-Andrè
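
Both directions mentioned above can be sketched side by side. This assumes b holds UTF-8 bytes, which matches the \xc3\xa8 sequence in the question; the names as_text and as_bytes are illustrative only. It runs on Python 2 and Python 3.

```python
a = u'Pal-Andr\xe8'        # unicode text
b = b'Pal-Andr\xc3\xa8'    # the same name as UTF-8 encoded bytes

# Option 1: decode the byte string, then format everything as text.
as_text = u"%s %s" % (a, b.decode('utf-8'))

# Option 2: encode the unicode string, then join everything as bytes.
as_bytes = b" ".join([a.encode('utf-8'), b])

# Both options describe the same content once brought to a common type.
print(as_bytes.decode('utf-8') == as_text)
```

Decoding to text is usually the more convenient choice, since the result can be formatted and sliced per character; the byte form is mainly useful when the output has to be written somewhere as raw UTF-8.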

Having a mix of Unicode and byte strings makes the combined print try to promote everything to Unicode strings. You have to decode the byte string with the correct codec, otherwise Python 2 falls back to ascii. b is a byte string encoded in UTF-8. The format string is promoted as well, but since it contains only ASCII characters, decoding it from ASCII happens to work. It is best to use Unicode everywhere:

>>> print u'%s %s' % (a,b.decode('utf8'))
Pal-Andrè Pal-Andrè
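
One way to apply "Unicode everywhere" is to decode at the point where data enters the program. The sketch below assumes incoming byte strings are UTF-8; to_text is a hypothetical helper, not something from the standard library. It runs on Python 2 and Python 3.

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return value as unicode text.

    Byte strings are decoded with `encoding` (assumed UTF-8 here);
    values that are already text are returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


a = u'Pal-Andr\xe8'        # already text
b = b'Pal-Andr\xc3\xa8'    # UTF-8 bytes, e.g. read from a file or socket

# After normalising, a unicode format string can be used safely.
line = u'%s %s' % (to_text(a), to_text(b))

print(to_text(b) == a)
```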


