
Python: printing UTF-8 characters stored in a unicode string

I'm reading a pickle file retrieved from some library. It contains a lot of UTF-8 characters stored in unicode strings. For example:

u'\xc4\x91' #đ
u'\xc3\xad' #í
u'\xc3\u017d' #�\u017d
...

I can encode and display most of them using raw_unicode_escape. However, all the characters with a \u escape, like the third one above, are not displayed properly. How can I fix that?

EDIT: Each string above should be a single character.

EDIT 2: The code I use to read the file:

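import pickle

# Path to the pickled model
model_dir = '../../projects/python/test/model-5'
with open(model_dir, 'rb') as f:
    model = pickle.load(f)
seq = model.sequitur
rightI = seq.rightInventory
# Show the raw escapes rather than the rendered characters
print(repr(rightI.list))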

The result contains strings similar to the examples above.

Maybe try this:

PYTHONIOENCODING="utf8" python script.py

You have a unicode string written with escape sequences. If you print it, and your console's font and encoding support the characters, you will see the following:

>>> import sys
>>> sys.stdout.encoding
'UTF-8'
>>> sys.getfilesystemencoding()
'UTF-8'
>>> i
[u'\xc4\x91', u'\xc3\xad', u'\xc3\u017d']
>>> for q in i:
...   print(q)
...
Ä
í
ÃŽ
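
The printed result is still not the single characters the question expects. The strings look like UTF-8 bytes that were decoded with a single-byte codec somewhere along the way: \xc4\x91 is the UTF-8 encoding of đ, and \u017d (Ž) is what Windows-1252 gives for byte 0x8E. Assuming that is really what happened (the question does not confirm it), a minimal sketch of a repair is to map every character back to its byte and decode the bytes as UTF-8. The name repair_mojibake is a hypothetical helper, not part of any library:

```python
# -*- coding: utf-8 -*-
def repair_mojibake(text):
    """Hypothetical helper: undo UTF-8 bytes that were mis-decoded as
    Windows-1252, with Latin-1 covering bytes cp1252 leaves undefined."""
    raw = bytearray()
    for ch in text:
        try:
            # Characters such as U+017D come from cp1252 (byte 0x8E).
            raw.extend(ch.encode('cp1252'))
        except UnicodeEncodeError:
            # C1 controls such as U+0091 only map back through Latin-1.
            raw.extend(ch.encode('latin-1'))
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8 after all.
    return bytes(raw).decode('utf-8')


samples = [u'\xc4\x91', u'\xc3\xad', u'\xc3\u017d']
fixed = [repair_mojibake(s) for s in samples]
# Print code points instead of glyphs so the result is terminal-independent.
print([hex(ord(c)) for c in fixed])  # ['0x111', '0xed', '0xce']
```

Those code points are đ, í and Î, one character per string. If a string contains a character that neither codec can encode, the helper raises UnicodeEncodeError, which is a sign the assumption does not hold for that data.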

To make sure they are rendered (printed on the screen) properly, you need to make sure that:

  • the encoding of the file is correct for the data stored in it.
  • the encoding and font of the terminal support the glyphs.

If you see a placeholder glyph instead of the character, it means that the encoding declared for the application does not support that specific code point, so the system does not know how to render it.
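
The encoding half of that checklist can be tested from Python before printing. Here is a rough sketch, assuming the only question is whether the target encoding can represent the text. Font support cannot be checked this way, and can_encode is a hypothetical helper name:

```python
import sys


def can_encode(text, encoding=None):
    """Hypothetical helper: True if text is representable in the encoding."""
    # Fall back to ASCII when stdout reports no encoding (e.g. when piped).
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(u'\u0111', 'utf-8'))  # True
print(can_encode(u'\u0111', 'ascii'))  # False
print(can_encode(u'\u0111'))           # depends on your terminal
```

If the last call returns False, setting PYTHONIOENCODING as suggested in the other answer is one way to change what stdout accepts.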
