
Python string encode and decode

Original 2018-01-08 13:11:06 5 2 python/ encoding/ utf-8

Encoding in JS means converting a string with special characters into an escaped, usable string. For example, encodeURIComponent converts spaces to %20 and so on, so that the result is usable in URIs.

So encoding here means converting to a particular format.

In Python 2.7, I have a string: 奥多比. To convert it into UTF-8 format, however, I need to use the decode() function, like this: "奥多比".decode("utf-8") == u'\u5965\u591a\u6bd4'

I want to understand how the meaning of encode and decode changes between languages. To me, I should essentially be doing "奥多比".encode("utf-8").

What am I missing here?

2 answers

You appear to be confusing Unicode text (represented in Python 2 as the unicode type, indicated by the u prefix on the literal syntax) with one of the standard Unicode encodings, UTF-8.

You are not creating UTF-8; you created a Unicode text object by decoding from a UTF-8 byte stream.
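The direction of the two operations can be sketched as follows. This is an illustration in Python 3 rather than the Python 2.7 of the question, on the assumption that Python 3's bytes stands in for Python 2's str and Python 3's str stands in for Python 2's unicode. The variable names are made up for the example.

```python
# Python 3 sketch (assumption: bytes here corresponds to Python 2's str,
# and str here corresponds to Python 2's unicode).

# The UTF-8 byte sequence for the three characters in the question.
utf8_bytes = b"\xe5\xa5\xa5\xe5\xa4\x9a\xe6\xaf\x94"

# decode: bytes -> Unicode text
text = utf8_bytes.decode("utf-8")
print(type(text).__name__)        # str

# encode: Unicode text -> bytes
round_trip = text.encode("utf-8")
print(type(round_trip).__name__)  # bytes
print(round_trip == utf8_bytes)   # True
```

Read this way, decode() in the question did not produce UTF-8; it consumed UTF-8 and produced text, and encode() is the call that produces UTF-8 again.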

The byte string literal "奥多比" is a sequence of binary data, bytes. You either entered these in a text editor and saved the file as UTF-8 (and told Python to treat your source code as UTF-8 by starting the file with a PEP 263 codec header), or you typed it into the Python interactive prompt in a terminal that was configured to send UTF-8 data.
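To make the difference between the bytes and the text concrete, here is a small Python 3 sketch that compares the code points of the three characters with the bytes UTF-8 uses to store them. It assumes Python 3 semantics (str is Unicode text, bytes is binary data); the names text and data are illustrative only.

```python
# Python 3 sketch: the same three characters seen as code points and as UTF-8 bytes.
text = "\u5965\u591a\u6bd4"      # Unicode text: three code points
data = text.encode("utf-8")      # binary data: the UTF-8 encoding of that text

code_points = [hex(ord(ch)) for ch in text]
byte_values = [hex(b) for b in data]

print(len(text), code_points)    # 3 code points
print(len(data), byte_values)    # 9 bytes, three per character
```

A Python 2 str literal typed in a UTF-8 terminal or saved in a UTF-8 source file holds the nine bytes, not the three code points.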

I strongly urge you to read more about the difference between bytes, codecs and Unicode text. The following links are highly recommended:

Ned Batchelder's Pragmatic Unicode
The Python Unicode HOWTO
Joel Spolsky's The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

In Python 2, "奥多比" is of type str, i.e. a sequence of bytes. To convert it to a Unicode string, you need to decode this sequence of bytes using a codec. Simply put, the codec specifies how the bytes should be converted to a sequence of Unicode code points. See the Unicode HOWTO for a more in-depth article on this.
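A short Python 3 sketch of why the codec matters: the same text maps to different bytes under different codecs, and decoding with the wrong codec either yields the wrong text or fails. The choice of utf-16-le, latin-1 and ascii as comparison codecs is an assumption made for illustration; none of them appear in the original question.

```python
# Python 3 sketch: the codec decides how bytes map to code points and back.
text = "\u5965\u591a\u6bd4"

utf8 = text.encode("utf-8")       # 9 bytes
utf16 = text.encode("utf-16-le")  # 6 bytes, a different byte sequence
print(len(utf8), len(utf16))

# Decoding with the matching codec recovers the text.
print(utf8.decode("utf-8") == text)
print(utf16.decode("utf-16-le") == text)

# Decoding with a mismatched codec does not.
wrong = utf8.decode("latin-1")    # succeeds, but yields 9 unrelated characters
print(wrong == text)              # False

error = None
try:
    utf8.decode("ascii")          # ASCII cannot represent bytes >= 0x80
except UnicodeDecodeError as exc:
    error = exc
print(type(error).__name__)       # UnicodeDecodeError
```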


