This must be a trivial task, but I can't get it to work. I have JSON that looks like this:
{'city': u'\\u0410\\u0431\\u0430\\u043a\\u0430\\u043d',
'language':{
u'\\u0410\\u043d\\u0433\\u043b\\u0438\\u0439\\u0441\\u043a\\u0438\\u0439': 5608,
u'\\u0418\\u0442\\u0430\\u043b\\u044c\\u044f\\u043d\\u0441\\u043a\\u0438\\u0439': 98
}
},
I'm trying to convert the unicode strings into utf-8.
string=u'\u0410\u0431\u0430\u043a\u0430\u043d'
string.encode('utf-8')
I get:
'\xd0\x90\xd0\xb1\xd0\xb0\xd0\xba\xd0\xb0\xd0\xbd'
Instead of:
u'Абакан'
What am I doing wrong?
What am I doing wrong?
Not printing it.
When you just evaluate a string in the Python REPL, you get its repr. Here that is '\xd0\x90\xd0\xb1\xd0\xb0\xd0\xba\xd0\xb0\xd0\xbd'. When you print it, you get Абакан.
print(string.encode('utf-8'))
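One caveat, as a hedged sketch assuming Python 3 (the question appears to use Python 2): there, encode() returns a bytes object, and print() on bytes still shows the escaped form. You would print the text itself, or decode the bytes first. The variable names below are only for illustration.

```python
# Assumption: Python 3, where text is str and encoded data is bytes.
text = u'\u0410\u0431\u0430\u043a\u0430\u043d'  # same code points as in the question
encoded = text.encode('utf-8')                   # bytes, not text

# print() on bytes shows the escaped form, just like evaluating it in the REPL.
print(encoded)   # b'\xd0\x90\xd0\xb1\xd0\xb0\xd0\xba\xd0\xb0\xd0\xbd'

# Decoding turns the UTF-8 bytes back into text, which prints readably.
decoded = encoded.decode('utf-8')
print(decoded)   # Абакан
```

So the encoding step itself is correct in both versions; only the way the result is displayed differs.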
As @Amadan said, you just need to print your string.
But why does printing the string resolve the problem?
The answer is that if you type string + Enter, the REPL displays the repr() representation of the object string, while running print string (or print(string) in Python 3.x) gives you the human-readable str() representation of string.
>>> converted = string.encode('utf8')
>>> converted
'\xd0\x90\xd0\xb1\xd0\xb0\xd0\xba\xd0\xb0\xd0\xbd'
>>> print converted
Абакан
>>> print repr(converted)
'\xd0\x90\xd0\xb1\xd0\xb0\xd0\xba\xd0\xb0\xd0\xbd'
>>> print str(converted)
Абакан
>>>
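The same effect shows up for the whole structure from the question: a container displays the repr of its items, so in Python 2 a printed dict still shows the \u escapes. A minimal sketch, assuming the goal is only to view the data in readable form, is to serialize it with json.dumps and ensure_ascii=False. It is written against Python 3, and the names data, readable and escaped are mine, not from the question.

```python
import json

# Assumption: this mirrors the structure shown in the question.
data = {
    'city': u'\u0410\u0431\u0430\u043a\u0430\u043d',
    'language': {
        u'\u0410\u043d\u0433\u043b\u0438\u0439\u0441\u043a\u0438\u0439': 5608,
        u'\u0418\u0442\u0430\u043b\u044c\u044f\u043d\u0441\u043a\u0438\u0439': 98,
    },
}

# By default json.dumps escapes every non-ASCII character as \uXXXX.
escaped = json.dumps(data)

# With ensure_ascii=False the characters are kept as they are.
readable = json.dumps(data, ensure_ascii=False, indent=2)
print(readable)
```

Both strings describe the same data; json.loads gives back an equal dict from either one.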
Further reading: Difference between __str__ and __repr__ in Python