Encode/Decode error using utf-8

How would I properly encode the following:

# -*- coding: utf-8 -*-

>>> 'What\x80\x99s Up: Balloon to the Rescue!'.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 4: ordinal not in range(128)
>>> 'What\x80\x99s Up: Balloon to the Rescue!'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 4: invalid start byte

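You've got two issues here. First, your UTF-8 byte sequence is incomplete: the right single quotation mark (’, U+2019) is encoded in UTF-8 as \xe2\x80\x99, and your string is missing the leading \xe2. Second, you are using the wrong method: the string already holds bytes, so you need to decode it from UTF-8 rather than encode it: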

>>> print 'What\xe2\x80\x99s Up: Balloon to the Rescue!'.decode('utf-8')
What’s Up: Balloon to the Rescue!
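
As a rough illustration of the direction of each call, here is a minimal sketch of the round trip. It is written with b'' and u'' literals so that it should behave the same on Python 2.7 and Python 3; the variable names are made up for the example.

```python
# -*- coding: utf-8 -*-
# Sketch only: variable names are illustrative, not from the question.

# UTF-8 bytes, with the full three-byte sequence for U+2019.
raw = b'What\xe2\x80\x99s Up: Balloon to the Rescue!'

# decode: bytes -> text
text = raw.decode('utf-8')
assert text == u'What\u2019s Up: Balloon to the Rescue!'

# encode: text -> bytes (the opposite direction)
round_trip = text.encode('utf-8')
assert round_trip == raw
```

The rule of thumb is that decode turns bytes into text and encode turns text into bytes.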
>>> type('What\x80\x99s Up: Balloon to the Rescue!')
<type 'str'>

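So you can't encode it, since it is a byte string (str) and not Unicode. In Python 2, calling .encode() on a str first decodes it implicitly with the ASCII codec, which is why you get a UnicodeDecodeError from an encode call.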
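
To see where each of the two tracebacks in the question comes from, the sketch below decodes the original (incomplete) bytes with both codecs and records where each one fails. The helper try_decode is hypothetical and exists only for this example; the explicit ASCII decode stands in for the implicit step that Python 2 performs inside str.encode().

```python
# Sketch only: try_decode is a made-up helper for this illustration.

# The bytes from the question: the \xe2 lead byte is missing.
broken = b'What\x80\x99s Up: Balloon to the Rescue!'

def try_decode(data, codec):
    """Return (True, text) on success or (False, (position, reason)) on failure."""
    try:
        return True, data.decode(codec)
    except UnicodeDecodeError as exc:
        return False, (exc.start, exc.reason)

# Mirrors the implicit ASCII decode behind the first traceback.
ascii_result = try_decode(broken, 'ascii')

# Mirrors the explicit .decode('utf-8') behind the second traceback.
utf8_result = try_decode(broken, 'utf-8')
```

Both attempts fail at position 4, the byte 0x80, which matches the positions reported in the two tracebacks.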

What is your Unicode input?
