How to convert UTF-8 hex to a Unicode code point in Python

I'm building a config file that maps each emoji's Unicode code point to its SoftBank Unicode code point. I'm using a Python program to scrape this information from http://punchdrunker.github.com/iOSEmoji/table_html/ios6/index.html

But there is a problem: the SoftBank code on the web page is UTF-8 hex, not a Unicode code point. How can I convert it to a Unicode code point?

For example, I want to change EE9095 to E415 (the first emoji).

I tried to do it like this:

code.decode('utf-8')

But it didn't work: the code came back unchanged. The Unix command iconv didn't work either.

Are you sure code is actually encoded in UTF-8? This works for me:

>>> b'\xee\x90\x95'.decode('utf-8')
u'\ue415'
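
One possible reason the original call appeared to do nothing: if code holds the six ASCII characters EE9095 rather than the three bytes they spell, decoding it as UTF-8 simply returns the same text. A minimal Python 3 sketch of the difference, with illustrative variable names that are not from the question:

```python
hex_text = "EE9095"                  # six ASCII characters, as scraped from the page
raw_bytes = bytes.fromhex(hex_text)  # the three bytes 0xEE 0x90 0x95

# Decoding the ASCII text as UTF-8 gives the same characters back.
unchanged = hex_text.encode("ascii").decode("utf-8")

# Decoding the actual bytes yields the single character U+E415.
char = raw_bytes.decode("utf-8")

print(unchanged, hex(ord(char)))
```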

How about this:

>>> 'EE9095'.decode('hex').decode('utf-8')
u'\ue415'
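
Note that str.decode('hex') exists only in Python 2. For Python 3, a rough equivalent might look like the sketch below. The helper name utf8_hex_to_codepoints is hypothetical, and it assumes the input is a clean string of hex digits encoding valid UTF-8:

```python
def utf8_hex_to_codepoints(hex_text):
    """Return the code points, as uppercase hex strings, encoded by UTF-8 hex digits."""
    # bytes.fromhex turns "EE9095" into b'\xee\x90\x95'.
    chars = bytes.fromhex(hex_text).decode("utf-8")
    # ord() gives the code point of each decoded character.
    return [format(ord(ch), "04X") for ch in chars]


print(utf8_hex_to_codepoints("EE9095"))
```

Returning a list rather than a single value lets the same helper handle table cells that contain more than one character.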
