UnicodeEncodeError: 'ascii' codec can't encode characters

Question

I have a dict that is the feed from a URL response. Like:

>>> d
{
0: {'data': u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'},
1: {'data': u'<p>some other data</p>'},
...
}

When I use the xml.etree.ElementTree functions on this data value (d[0]['data']), I get the most famous error message:

UnicodeEncodeError: 'ascii' codec can't encode characters...

What should I do to this Unicode string to make it suitable for the ElementTree parser?

PS. Please don't send me links with explanations of Unicode and Python. Unfortunately I have read it all already and can't make use of it; hopefully others can.

Answer 1

You'll have to encode it manually, to UTF-8:

ElementTree.fromstring(d[0]['data'].encode('utf-8'))

because the API only takes encoded bytes as input. UTF-8 is a good default for this kind of data.
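
To see why the explicit encode matters, here is a minimal sketch that reproduces the same failure outside ElementTree. It assumes (as I understand the Python 2 behaviour) that a unicode string handed to the parser gets encoded implicitly with the default 'ascii' codec; the names text, ascii_failed and encoded are only for illustration. It should run unchanged on Python 2.7 and Python 3.

```python
# The markup from the question, as a unicode string.
text = u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'

# Encoding with the 'ascii' codec fails on the CJK characters,
# which is the same error the question shows.
ascii_failed = False
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    ascii_failed = True
    print('ascii codec failed: %s' % exc.reason)

# Encoding explicitly as UTF-8 succeeds and loses nothing:
# the bytes decode back to the original string.
encoded = text.encode('utf-8')
print('round trip ok: %s' % (encoded.decode('utf-8') == text))
```

The UTF-8 bytes in encoded are what gets passed to ElementTree.fromstring.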

It'll be able to decode to unicode again from there:

>>> from xml.etree import ElementTree
>>> p = ElementTree.fromstring(u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'.encode('utf8'))
>>> p.text
u'found "\u62c9\u67cf \u591a\u516c \u56ed"'
>>> print p.text
found "拉柏 多公 园"
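
If you need this for every entry in the feed, something along these lines should work. This is only a sketch: feed is a hypothetical stand-in for the dict d from the question, and parse_entry is a helper name I made up, not part of ElementTree. It is written to run on both Python 2.7 and Python 3.

```python
from xml.etree import ElementTree

# Hypothetical stand-in for the dict `d` shown in the question.
feed = {
    0: {'data': u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'},
    1: {'data': u'<p>some other data</p>'},
}


def parse_entry(entry):
    """Encode the unicode markup as UTF-8 bytes, then parse it."""
    return ElementTree.fromstring(entry['data'].encode('utf-8'))


# Map each feed key to the text of its parsed <p> element.
texts = {key: parse_entry(entry).text for key, entry in feed.items()}
```

Each value in texts is the decoded element text, so the Chinese characters survive the round trip through UTF-8.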

Answer 1
25 accepted 2012-11-21 12:46:52
