Converting a Unicode string to ASCII in Python 2.7

Question

I have an interesting problem.

I am getting a Unicode string passed to a variable, and I want to convert it to a normal ASCII string.

I can't seem to figure out how to do this in Python 2.7.

The following works in Python 3:

rawdata = '\u003c!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"\u003e'
b = bytearray()
b.extend(map(ord, rawdata))
c = ''.join(chr(i) for i in b)

If I call print(c), I get a nice, clean output:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

But when I run this in Python 2.7, it still prints the Unicode escape sequences (essentially printing the rawdata variable again).

What am I doing wrong? There has got to be a simple call that I'm not making.

Answer 1

So I literally found the answer 2 minutes after posting this.

The answer is to do the following in Python 2.7:

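# Python 2: a plain '...' literal is a byte string, so \u003c stays a literal escape.
rawdata = '\u003c!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"\u003e'
# raw_unicode_escape decodes each \uXXXX sequence; the result is a unicode object.
asciistr = rawdata.decode("raw_unicode_escape")
print asciistr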
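
If the same code has to run on both versions, the decode step can be wrapped so it accepts either bytes or text. The following is only a sketch: the helper name decode_escapes is made up for illustration, and it assumes the input is plain ASCII apart from the escape sequences and that the decoded result is meant to be pure ASCII.

```python
def decode_escapes(raw):
    # Hypothetical helper: accepts bytes or text holding literal \uXXXX escapes.
    if not isinstance(raw, bytes):
        # Text input: get bytes first; fails loudly if it is not pure ASCII.
        raw = raw.encode("ascii")
    text = raw.decode("raw_unicode_escape")
    # Confirm the decoded result really is ASCII (raises UnicodeEncodeError otherwise).
    text.encode("ascii")
    return text


# A raw literal keeps the backslashes, mimicking what Python 2 stores for '\u003c'.
rawdata = r'\u003c!DOCTYPE html\u003e'
print(decode_escapes(rawdata))  # prints: <!DOCTYPE html>
```

In Python 2 the return value is a unicode object; call .encode("ascii") on it if a byte string is needed.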

Answer 2

For better portability across both versions, you should use Unidecode, which transliterates Unicode text into its closest ASCII representation and does exactly what you want.

>>> from unidecode import unidecode
>>> unidecode(u'ko\u017eu\u0161\u010dek')
'kozuscek'
>>> unidecode(u'30 \U0001d5c4\U0001d5c6/\U0001d5c1')
'30 km/h'
>>> unidecode(u"\u5317\u4EB0")
'Bei Jing '
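
If installing a third-party package is not an option, a rough approximation is possible with the standard library alone. This is not what Unidecode does internally, and it is lossy: it relies on Unicode NFKD normalization and then drops every character that has no ASCII decomposition, so the Chinese example above comes out empty rather than as 'Bei Jing '. The function name strip_to_ascii is made up for this sketch, which was written with Python 3 in mind.

```python
import unicodedata


def strip_to_ascii(text):
    # NFKD splits accented letters into a base letter plus combining marks
    # and maps compatibility characters (e.g. math-styled letters) to plain ones.
    decomposed = unicodedata.normalize("NFKD", text)
    # Anything still outside ASCII (combining marks, CJK, ...) is discarded.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_to_ascii(u'ko\u017eu\u0161\u010dek'))              # prints: kozuscek
print(strip_to_ascii(u'30 \U0001d5c4\U0001d5c6/\U0001d5c1'))  # prints: 30 km/h
```

Unidecode remains the better choice whenever non-Latin scripts need a readable transliteration.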

Answer 1: 1, 2017-09-26 14:23:08
Answer 2: 0, accepted, 2017-09-26 14:23:17
