Chinese characters in UTF-8

Question

>>> s='未作評級'
>>> s
'\xe6\x9c\xaa\xe4\xbd\x9c\xe8\xa9\x95\xe7\xb4\x9a'
>>> s = unicode(s)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)

How would I get the 未作評級 into Unicode?

Answer 1

Either use a Unicode string from the start:

>>> s = u'未作評級'

or decode the string from its current encoding (which appears to be UTF-8). Then you get a Unicode string.

>>> s = '未作評級'.decode("utf-8")
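
To see why the original call fails and why decoding works, here is a minimal sketch. It assumes the bytes really are UTF-8, as the repr in the question suggests. In Python 2, unicode(s) with no encoding argument falls back to the default ASCII codec, which cannot handle byte 0xe6. The names raw, text, ascii_failed and round_trip are only illustrative. The byte string is written with a b prefix so the same snippet should also run on Python 3, where the result of decode() is a str rather than a unicode object.

```python
# -*- coding: utf-8 -*-
# The UTF-8 bytes from the question, written out as an explicit byte string.
raw = b'\xe6\x9c\xaa\xe4\xbd\x9c\xe8\xa9\x95\xe7\xb4\x9a'

# Decoding as ASCII fails, because 0xe6 is outside the 0-127 range.
# This is the same failure that unicode(s) reports in the question.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the encoding the bytes are actually in gives Unicode text.
text = raw.decode("utf-8")

# Encoding the text again yields the original bytes.
round_trip = text.encode("utf-8")
```

In Python 2 the same decoding can also be written as unicode(s, "utf-8"), which passes the encoding explicitly instead of relying on the ASCII default.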

Answer 1: score 6, accepted, 2013-07-25 19:47:20
