Python unicode character conversion for Emoji
I'm having some issues converting a Unicode code point string to unicode. There is some oddness in how my character is being expressed: instead of printing an emoji character, Python just prints the escape string itself. Here's my example.
import re

# These codes come from a JSON file; this is a representation of one of them.
e = 'U+1F600'  # grinning face emoji
# Not sure how to clean this, so here's a basic attempt using regex.
b = re.compile(r'U\+', re.DOTALL).sub('\U000', e)
print unicode(b)  # output should be '\U0001F600'
For whatever reason this doesn't print an emoji character. However, if you type out the same string as a literal with the u prefix, everything works as expected.
print u'\U0001F600'
What am I doing wrong here? I thought the unicode function would convert my string to the working equivalent, but apparently it does not.
I'm using Python 2.7.
I guess decode is what you are looking for:
>>> b = '\U0001F600'
>>> print b.decode('unicode-escape')
😀
or
>>> print unicode(b, 'unicode-escape')
😀
The issue with print unicode(b) is that the unicode function decodes the byte string with the default ASCII codec, so the backslash stays a literal character and the result is the ten-character string u'\\U0001F600' rather than the single emoji character. To prevent this, we pass unicode-escape as the encoding, which tells Python to interpret the escape sequence.
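The difference between the two codecs can be seen side by side in a minimal sketch. The variable names here are illustrative only, and the b'' prefix is used so the snippet reads the same on Python 2.7 and Python 3.

```python
raw = b'\\U0001F600'  # ten bytes: a backslash, 'U', and eight hex digits

# Decoding as ASCII keeps every byte as its own character,
# which is what unicode(b) does by default on Python 2.7.
as_ascii = raw.decode('ascii')

# Decoding as unicode-escape interprets the escape sequence instead.
as_escape = raw.decode('unicode-escape')
```

Here as_ascii still holds the escape text, while as_escape holds the emoji character itself.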