Encoding/converting a string with Spanish or Polish characters

1) How do I convert a variable holding a string like "wdzi\\xc4\\x99czno\\xc5\\x9bci" into "wdzięczności"?

2) Also, how do I convert a string variable containing characters like "±", "ę", "Ć" into the correct letters?

I emphasise "variable" because all I've found by googling are examples that use literals such as u'some string', and I can't get anything like that to work.

I use "# -*- coding: utf-8 -*-" on the second line of my script, and I still run into these problems.

I was also told that a simple print should output the string correctly, but it does not.

In Python 2.7 IDLE, I get this output:

>>> print "wdzi\xc4\x99czno\xc5\x9bci".decode('utf-8')
wdzięczności

Your first string appears to be a UTF-8 byte string, so all that's necessary is to decode it into a Unicode string. When Python prints that string, it will encode it back to the proper encoding based on your environment.
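Since the question is about a variable rather than a literal, here is a minimal sketch of the same decode step applied to a variable. The names `raw` and `text` are only illustrative, and the sketch assumes the variable really holds UTF-8 bytes (a plain `str` in Python 2, a `bytes` object in Python 3):

```python
# Assumption: `raw` holds UTF-8 encoded bytes, e.g. read from a file or socket.
# The b prefix is accepted by both Python 2.7 and Python 3.
raw = b"wdzi\xc4\x99czno\xc5\x9bci"

# decode() is an ordinary method, so it works on any variable, not just literals.
# The result is a Unicode string (unicode in Python 2, str in Python 3).
text = raw.decode("utf-8")

# Each Polish letter takes two bytes in UTF-8 but is one character after decoding.
print(len(raw), len(text))  # 14 bytes, 12 characters
```

The u'...' prefix from the examples you found only applies to literals typed into the source file; for data that arrives in a variable, decode() does the equivalent job.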

If you're using Python 3, then you have a string that has been decoded improperly (its UTF-8 bytes were interpreted as ISO-8859-1 characters), and it will need a little more work to fix the damage.

>>> print("wdzi\xc4\x99czno\xc5\x9bci".encode('iso-8859-1').decode('utf-8'))
wdzięczności
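For a variable in Python 3, the same repair can be wrapped in a small helper. This is only a sketch: the function names `fix_mojibake` and `unescape_utf8` are made up for illustration. The second helper covers a case I am assuming from the way the string is written in the question, namely that the variable contains literal backslash characters (the text `\xc4` rather than a single character) and is otherwise plain ASCII:

```python
def fix_mojibake(text):
    """Repair a str whose UTF-8 bytes were wrongly decoded as ISO-8859-1."""
    # ISO-8859-1 maps code points 0-255 straight back to bytes 0-255,
    # so encoding with it recovers the original bytes without loss.
    return text.encode("iso-8859-1").decode("utf-8")


def unescape_utf8(text):
    """Handle a str that contains literal backslash escapes like \\xc4\\x99."""
    # Assumption: the text is pure ASCII apart from the escapes.
    # unicode_escape turns each literal \xNN sequence into the character U+00NN,
    # which leaves exactly the damaged form that fix_mojibake() expects.
    unescaped = text.encode("ascii").decode("unicode_escape")
    return fix_mojibake(unescaped)


damaged = "wdzi\xc4\x99czno\xc5\x9bci"   # wrong characters, no backslashes
escaped = r"wdzi\xc4\x99czno\xc5\x9bci"  # raw string: backslashes are real data

# Both routes should arrive at the same correctly decoded word.
print(fix_mojibake(damaged) == unescape_utf8(escaped))  # True
```

If fix_mojibake() raises UnicodeEncodeError or UnicodeDecodeError, the string was probably not damaged in this particular way and a different encoding pair is involved.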
