Converting string of unicode emoji

Question

I have a list of strings that represent Unicode emoji as code points, e.g.:

emoji[0] = 'U+270DU+1F3FF'

I would like to convert this "almost" Unicode representation to the actual emoji characters so that I can search through text documents that contain these emoji. I tried the following, but it fails:

emoji[0] = emoji[0].replace('U+', '\U000')
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-4: truncated \UXXXXXXXX escape

How can I accomplish that?

Answer 1

A solution that works regardless of how many hex digits each code point has:

>>> import re
>>> e = 'U+270DU+1F3FF'
>>> def emojize(match):
...     return chr(int(match.group(0)[2:], 16))
...
>>> re.sub(r"U\+[0-9A-F]+", emojize, e)
'✍🏿'
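
As a rough sketch of how this could be applied to the whole list and then used for searching, assuming Python 3. The names `to_emoji`, `codes` and `text` are made up for illustration, and the pattern also accepts lowercase hex digits, which the question does not mention:

```python
import re

# One "U+" followed by hex digits; the digits are captured as group 1.
CODEPOINT = re.compile(r"U\+([0-9A-Fa-f]+)")

def to_emoji(s):
    """Replace every U+XXXX token in s with the character it names."""
    return CODEPOINT.sub(lambda m: chr(int(m.group(1), 16)), s)

# Hypothetical input list in the format from the question.
codes = ["U+270DU+1F3FF", "U+1F600"]
emoji = [to_emoji(c) for c in codes]

# Hypothetical document text; plain substring search finds the emoji.
text = "signed \u270d\U0001f3ff and sent"
found = [e for e in emoji if e in text]
```

Note that `chr` raises `ValueError` for values above 0x10FFFF, so malformed input would need separate handling.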

Answer 2

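This is because a \U escape needs exactly 8 hex digits: 270D has 4 digits, so it needs 4 zeros of padding, while 1F3FF has 5, so it needs only 3. In Python 2: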

>>> e = 'U+270D'
>>> print e.replace('U+', '\U0000').decode('unicode-escape')
✍
>>> e = 'U+1F3FF'
>>> print e.replace('U+', '\U000').decode('unicode-escape')
🏿
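
The snippet above is Python 2. In Python 3, `str` has no `.decode`, and a literal `'\U000'` is rejected at compile time, which is the SyntaxError from the question. A rough Python 3 equivalent of the same padding idea might look like this, assuming each string holds a single code of at most 8 hex digits; `pad_and_decode` is a hypothetical helper name:

```python
def pad_and_decode(code):
    """Turn a single 'U+XXXX' string into the character it names."""
    hex_digits = code[2:]                  # drop the leading 'U+'
    escaped = "\\U" + hex_digits.zfill(8)  # pad to 8 digits, e.g. \U0000270D
    # Encode to bytes so the escape sequence can be interpreted.
    return escaped.encode("ascii").decode("unicode-escape")

print(pad_and_decode("U+270D"))
print(pad_and_decode("U+1F3FF"))
```

Padding with `zfill(8)` avoids having to choose the number of zeros by hand for each code.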

2 answers

solution1
3 ACCEPTED 2017-12-14 14:04:47

solution2
2 2017-12-14 13:55:48