Python unicode character conversion for Emoji

I'm having some issues converting a Unicode code point written in U+ notation into a unicode character. There is some oddness in how my character is being expressed. Basically, Python is not printing an emoji character; instead it just prints the escape sequence as a plain string. Here's my example.

import re

# these codes come from a JSON file; this is a representation of one of the codes.
e = 'U+1F600' # smile grin emoji

# not sure how to clean this, so here's a basic attempt using regex.
b = re.compile(r'U\+', re.DOTALL).sub('\U000', e)

print unicode(b) # output should be '\U0001F600'

For whatever reason this doesn't print an emoji character.

However, if you type out the same string as a literal, using the u prefix, everything works as expected.

print u'\U0001F600'

What am I doing wrong here? I thought that the unicode function would convert my string to the working equivalent, but apparently it does not.

I'm using Python 2.7.

I guess decode is what you are looking for:

>>> b = '\U0001F600'
>>> print b.decode('unicode-escape')
😀

or

>>> print unicode(b, 'unicode-escape')
😀

The issue with

print unicode(b)

is that b holds the ten characters \U0001F600 as plain text (a literal backslash followed by U0001F600), and the unicode function decodes it with the default ASCII codec. The result is u'\\U0001F600', which still contains the literal backslash rather than the emoji. To prevent this, we pass unicode-escape as the encoding, so the escape sequence is interpreted as a code point.
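
Putting this together for the 'U+1F600' input from the question, a minimal sketch might look like the following. The helper name code_point_to_char is hypothetical (it is not from the question), and it assumes every code in the JSON file is 'U+' followed by one to eight hex digits. It is written so that it should run on Python 2.7 as well as Python 3.

```python
import re


def code_point_to_char(code):
    """Turn a code point written as 'U+1F600' into the character itself.

    Hypothetical helper; assumes `code` is 'U+' followed by 1-8 hex digits.
    """
    match = re.match(r'^U\+([0-9A-Fa-f]{1,8})$', code)
    if match is None:
        raise ValueError('not a U+XXXX code point: %r' % (code,))
    # Build the ten-character escape text, e.g. \U0001F600 (literal backslash).
    escaped = '\\U' + match.group(1).zfill(8)
    # Decoding with unicode-escape interprets that text as a code point.
    return escaped.encode('ascii').decode('unicode-escape')


smile = code_point_to_char('U+1F600')
print(smile == u'\U0001F600')
```

Whether print smile actually shows the emoji still depends on the terminal's encoding and font support.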
