What does str.encode expect as input?

Question

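I want to use unicode instead of str for all strings in my project. I am trying to use the str.encode method, but I cannot tell from the documentation exactly what encode does or what it expects as input.

The Greek lowercase letter pi is U+03C0, which is 0xCF 0x80 when encoded as UTF-8. I get the following: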

>>> s1 = '\xcf\x80'
>>> s1.encode('utf-8','ignore')

Traceback (most recent call last):
  File "<pyshell#61>", line 1, in <module>
    s1.encode('utf-8','ignore')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcf in position 0: ordinal not in range(128)

I also tried:

>>> s2='\x03\xc0'

>>> s2.encode('utf-8','ignore')

Traceback (most recent call last):
  File "<pyshell#62>", line 1, in <module>
    s2.encode('utf-8','ignore')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc0 in position 1: ordinal not in range(128)

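What does encode expect as input, and why doesn't the 'ignore' option ignore the errors? I also tried 'replace', and that did not suppress the error either.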

Answer 1

In Python 2.x, str is a byte string (already encoded). You can decode it into a unicode object:

>>> s1 = '\xcf\x80'  # string literal (str)
>>> s1.decode('utf-8')
u'\u03c0'

On a unicode object, you can perform the encoding:

>>> u1 = u'\u03c0'  # unicode literal (unicode)  U+03C0
>>> u1.encode('utf-8')
'\xcf\x80'
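
As for why 'ignore' and 'replace' make no difference: in Python 2, calling encode on a str first decodes it implicitly with the default codec (ASCII, in strict mode), and the errors argument only applies to the encode step that follows. The traceback shows a UnicodeDecodeError, not an encode error, for that reason. Below is a rough sketch of that behaviour written for Python 3, where bytes plays the role of Python 2's str and str plays the role of unicode. The helper name py2_style_encode is made up for illustration and is not part of any library.

```python
# Python 3 sketch: bytes ~ Python 2 str, str ~ Python 2 unicode.

raw = b'\xcf\x80'  # UTF-8 bytes for U+03C0 (Greek small letter pi)

# Explicit decode followed by encode round-trips cleanly.
text = raw.decode('utf-8')
assert text == '\u03c0'
assert text.encode('utf-8') == raw


def py2_style_encode(data, encoding, errors='strict'):
    """Hypothetical helper imitating Python 2's str.encode.

    Step 1 is an implicit, strict ASCII decode; the caller's
    `errors` argument is only used in step 2, the actual encode.
    """
    intermediate = data.decode('ascii')            # `errors` is not applied here
    return intermediate.encode(encoding, errors)   # `errors` applies only here


# Pure-ASCII bytes survive the implicit decode step.
ascii_result = py2_style_encode(b'pi', 'utf-8')

# Non-ASCII bytes fail in step 1, no matter which error handler is passed.
failure = None
try:
    py2_style_encode(raw, 'utf-8', 'ignore')
except UnicodeDecodeError as exc:
    failure = exc
```

In other words, encode wants a unicode object as input. If you start from bytes, call decode with the codec that actually produced them ('utf-8' here) before doing anything else.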


Answer 1
3 Accepted 2015-01-02 05:07:42
