
How to convert u'\x96' to u'–' in Python

I'm porting content from an old WordPress blog to Mezzanine. I was given a JSON dump of the database, and the posts are littered with special characters that look like this: \x96, among otherwise unescaped HTML.

If I manually replace the backslash with &# and append a semicolon, the character renders correctly,

so \x96 to &#x96;

that is, escaped UTF-8 (hex) to HTML entity (hex).

How can I do this in Python?

If &#150; is also acceptable, you can use:

>>> u'\x96'.encode('ascii', 'xmlcharrefreplace')
'&#150;'

which is even called out in the documentation [1].

[1] (although not very clearly)...
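
The snippet above is Python 2. As a minimal sketch for Python 3 (where `encode` returns `bytes`), the helpers below show three options: decimal references via `xmlcharrefreplace`, the hex form the question asked for, and a direct repair to a real en dash. The function names are hypothetical, and the repair step assumes the stray characters were originally Windows-1252 bytes, which is the usual reason 0x96 shows up as a dash in browsers; the dump in the question may differ.

```python
def to_decimal_refs(text):
    """Replace each non-ASCII character with a decimal reference, e.g. &#150;."""
    # encode() returns bytes in Python 3, so decode back to str.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_hex_refs(text):
    """Replace each non-ASCII character with a hex reference, e.g. &#x96;."""
    return "".join(
        ch if ord(ch) < 128 else "&#x{:x};".format(ord(ch)) for ch in text
    )


def repair_cp1252(text):
    """Map C1 control characters (U+0080..U+009F) to their Windows-1252 meaning.

    Assumption: these characters are Windows-1252 bytes that were decoded
    as Latin-1 somewhere upstream.
    """
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                ch = ch.encode("latin-1").decode("cp1252")
            except UnicodeDecodeError:
                pass  # byte has no Windows-1252 mapping; keep it unchanged
        out.append(ch)
    return "".join(out)


sample = "1990\x961995"
print(to_decimal_refs(sample))
print(to_hex_refs(sample))
print(repair_cp1252(sample) == "1990\u20131995")
```

If the goal is the literal u'–' from the title, the repair approach avoids entities entirely and stores U+2013 directly, which may be preferable when the target system handles Unicode text.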


