How to convert u'\\x96' to u'–' in Python

Question

I'm porting content from an old WordPress blog to Mezzanine. I was given a JSON dump of the database, and the posts are littered with special characters that look like this: \\x96 among otherwise unescaped HTML.

If I manually replace the slash with &# and append a semicolon, the character renders correctly,

so \\x96 to &#x96;

escaped UTF-8 (hex) to HTML entity (hex)

How can I do this in Python?

Answer 1

If &#150; is also acceptable, you can use:

>>> u'\x96'.encode('ascii', 'xmlcharrefreplace')
'&#150;'

which is even called out in the documentation¹.

¹ (although not very clearly)...
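For the case described in the question, where the dump contains the literal characters \x96 rather than the character itself, the manual replacement can be sketched roughly as below for Python 3. This is a minimal sketch under assumptions not confirmed by the question: the function names are hypothetical, and the escaped byte values are assumed to be Windows-1252, in which 0x96 is the en dash (U+2013). That assumption would also explain why browsers show &#150; as a dash, since HTML parsers map numeric references in the 0x80 to 0x9F range through Windows-1252.

```python
import re

# Matches a literal backslash, an 'x', and two hex digits, i.e. the text
# \x96 as it appears in the dumped posts (not the single character U+0096).
_ESCAPE = re.compile(r'\\x([0-9A-Fa-f]{2})')


def escapes_to_entities(text):
    """Rewrite literal \\xNN sequences as hex character references (&#xNN;)."""
    return _ESCAPE.sub(lambda m: '&#x%s;' % m.group(1), text)


def escapes_to_unicode(text):
    """Rewrite literal \\xNN sequences as real characters.

    Assumption: the byte values are Windows-1252. Bytes that cp1252 leaves
    undefined become U+FFFD instead of raising an error.
    """
    def decode(match):
        byte = bytes([int(match.group(1), 16)])
        return byte.decode('cp1252', errors='replace')
    return _ESCAPE.sub(decode, text)


sample = r'<p>1990\x962000</p>'
print(escapes_to_entities(sample))
print(escapes_to_unicode(sample))

# The answer's one-liner returns bytes on Python 3.
print('\x96'.encode('ascii', 'xmlcharrefreplace'))  # b'&#150;'
```

If the text already holds the single character u'\x96' (U+0096) instead of the escaped form, `'\x96'.encode('latin-1').decode('cp1252')` yields the en dash directly, again assuming the Windows-1252 origin.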

Answer 1 (accepted, 2014-04-04 05:56:16)