
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa3

I got this string: 'Velcro Back Rest \xa36.99'. Note that it does not have a u in front. It's just plain ASCII.

How do I convert it to unicode?

I tried this:

>>> unicode('Velcro Back Rest \xa36.99')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa3 in position 17: ordinal not in range(128)

This answer explains it nicely, but I have the same question as the OP of that question. In the answer to that comment, Winston says "You should not encoding a string object ..."

But the framework I am working with requires that it be converted to a unicode string. I use scrapy and I have this line:

loader.add_value('name', product_name)

Here product_name contains that problematic string, and it throws the error.

You need to specify an encoding to decode the bytes to Unicode with:

>>> 'Velcro Back Rest \xa36.99'.decode('latin1')
u'Velcro Back Rest \xa36.99'
>>> print 'Velcro Back Rest \xa36.99'.decode('latin1')
Velcro Back Rest £6.99
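If the value has to be Unicode before it reaches the loader, the explicit decode can be wrapped in a small helper. This is only a sketch: `to_text` is a hypothetical name (not part of Scrapy), and `latin1` as the default is an assumption that fits this particular data. It uses a `b''` literal so it behaves the same on Python 2 and Python 3.

```python
def to_text(value, encoding='latin1'):
    """Return value as a Unicode string, decoding byte strings explicitly.

    Hypothetical helper; the default encoding is an assumption and must
    match the real encoding of the source data.
    """
    if isinstance(value, bytes):  # on Python 2, bytes is an alias for str
        return value.decode(encoding)
    return value  # already a Unicode string, leave it alone


product_name = to_text(b'Velcro Back Rest \xa36.99')
# product_name is now a Unicode string and can be passed on, e.g. to
# loader.add_value('name', product_name)
```

Passing an already-decoded string through the helper is harmless, so it can be applied to every value without checking the type first.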

In this case, I was able to guess the encoding from experience; in general, you need to provide the correct codec for each encoding you encounter. For web data, that is usually included in the form of the content-type header:

Content-Type: text/html; charset=iso-8859-1

where iso-8859-1 is the official standard name for the Latin 1 encoding, for example. Python recognizes latin1 as an alias for iso-8859-1.
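As a rough illustration of using that header instead of guessing, the charset parameter can be pulled out with the standard library's `email.message.Message`. The function name `charset_from_content_type` and the `utf-8` fallback are assumptions made for this sketch; how you obtain the header value depends on your HTTP client.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the charset parameter of a Content-Type header value.

    Hypothetical helper; returns `default` when no charset is declared.
    """
    msg = Message()
    msg['Content-Type'] = header_value
    return msg.get_content_charset(default)


header = 'text/html; charset=iso-8859-1'
body = b'Velcro Back Rest \xa36.99'

charset = charset_from_content_type(header, 'utf-8')  # fallback is an assumption
text = body.decode(charset)
```

A server can declare the wrong charset, so a decode error or odd-looking characters are still worth checking for.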

Note that your input data is not plain ASCII. If it were, it would only use bytes in the range 0 through 127; \xa3 is 163 decimal, so it falls outside the ASCII range.
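A quick way to confirm this is to list every byte above 127 together with its position. This is a small sketch that runs on both Python 2 and 3; the variable names are just for illustration.

```python
raw = b'Velcro Back Rest \xa36.99'

# bytearray yields integers on both Python 2 and Python 3
non_ascii = [(pos, byte) for pos, byte in enumerate(bytearray(raw)) if byte > 127]

print(non_ascii)  # [(17, 163)]
```

The single hit at position 17 is the same byte the traceback complains about.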


