简体   繁体   中英

Python String encode method

In Python, there is an encode method in unicode strings to encode from unicode to byte string. There is a decode method in string to do the reverse.

But I'm confused what the encode method in string for?

It's useful for non-text codecs.

>>> 'Hello, world!'.encode('hex')
'48656c6c6f2c20776f726c6421'
>>> 'Hello, world!'.encode('base64')
'SGVsbG8sIHdvcmxkIQ==\n'
>>> 'Hello, world!'.encode('zlib')
'x\x9c\xf3H\xcd\xc9\xc9\xd7Q(\xcf/\xcaIQ\x04\x00 ^\x04\x8a'

It first decodes to Unicode using the default encoding, then encodes back to a byte string.

>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> sys.setdefaultencoding('latin-1')
>>> '\xc4'.encode('utf-8')
'\xc3\x84'

Here, '\\xc4' is Latin-1 for Ä and '\\xc3\\x84' is UTF-8 for Ä.

Why don't you want to read the fine Python documentation yourself?

http://docs.python.org/release/2.5.2/lib/string-methods.html

""" encode( [encoding[,errors]]) Return an encoded version of the string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error, see section 4.8.1. For a list of possible encodings, see section 4.8.3. New in version 2.0. Changed in version 2.3: Support for 'xmlcharrefreplace' and 'backslashreplace' and other error handling schemes added. """

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM