
How can I decode a Unicode string in Python?

The Wikipedia API returns strings with Unicode escape sequences in them:

"Golden Globe Award for Best Motion Picture \u2013 Drama"

How can I convert it back to:

"Golden Globe Award for Best Motion Picture – Drama"

The Wikipedia API returns JSON data; use the json module to decode it:

json.loads(inputstring)

Demo:

>>> import json
>>> print json.loads('"Golden Globe Award for Best Motion Picture \u2013 Drama"')
Golden Globe Award for Best Motion Picture – Drama
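The demo above uses Python 2 syntax (print as a statement). As a minimal sketch of the same idea in Python 3, assuming the raw JSON text from the API is already in hand as a string, something like the following should work. The variable names raw and title are illustrative only, and the r prefix is there so the backslash stays literal, as it would be in the HTTP response body.

```python
import json

# Raw JSON text as it might arrive from the API (assumed input).
# The r prefix keeps \u2013 as six literal characters, not an en dash.
raw = r'"Golden Globe Award for Best Motion Picture \u2013 Drama"'

# json.loads interprets the JSON escape and returns a Python str.
title = json.loads(raw)

print(title)
```

Without the r prefix, Python 3 would resolve the escape in the string literal itself, and json.loads would still return the same text.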

If you instead have a string that starts with u'', you already have a Python unicode value and are looking at the representation of that string:

>>> json.loads('"Golden Globe Award for Best Motion Picture \u2013 Drama"')
u'Golden Globe Award for Best Motion Picture \u2013 Drama'

Just print that value to have Python encode it to your terminal codec and represent that en dash character (U+2013) in a format your terminal will understand.
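To make the distinction between a value and its representation concrete, here is a small sketch in Python 3 syntax (an assumption, since the answer itself uses Python 2). The built-in ascii() produces an escaped representation similar to what the Python 2 prompt echoes for a unicode value, while the string itself contains one real dash character.

```python
# A Python 3 str is already Unicode text; the literal escape below
# produces a single en dash character (U+2013).
title = "Golden Globe Award for Best Motion Picture \u2013 Drama"

# ascii() returns a representation with non-ASCII characters escaped.
escaped = ascii(title)

# The escape sequence exists only in the representation, not in the value.
in_representation = "\\u2013" in escaped
in_value = "\\u2013" in title
```

Here in_representation is True and in_value is False: the backslash sequence is only how the value is displayed, not what it contains.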

You may want to read up about Python, Unicode, and encodings before you continue if you do not understand the difference between a unicode value and a byte string.
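As a rough illustration of that difference, the sketch below (Python 3 syntax, with UTF-8 chosen as an assumed encoding) converts text to bytes and back. The names text, data, and restored are hypothetical.

```python
# Unicode text: a sequence of code points.
text = "Picture \u2013 Drama"

# Encoding turns text into bytes; the en dash takes three bytes in UTF-8.
data = text.encode("utf-8")

# Decoding with the same codec turns the bytes back into text.
restored = data.decode("utf-8")

# The text and byte lengths differ because one character became three bytes.
lengths = (len(text), len(data))
```

The text value is what you work with inside the program; the bytes are what gets written to a file, a socket, or a terminal.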
