UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 3 2: ordinal not in range(128)

Question

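I am using xlrd to parse an XLS file. Most things work fine. I have a dictionary whose keys are strings and whose values are lists of strings. All keys and values are Unicode. I can print most of the keys and values with the str() method, but some values contain the Unicode character \u2013, and for those I get the error above.

I suspect this happens because it is Unicode embedded in Unicode and the Python interpreter cannot decode it. How can I get rid of this error?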

Answer 1

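You can print Unicode objects directly; you don't need to wrap them in str().

Assuming you really do want a str:

When you call str(u'\u2013'), you are trying to convert a Unicode string into an 8-bit string. That requires an encoding, which is a mapping between Unicode data and 8-bit data. str() uses the system default encoding, which under Python 2 is ASCII. ASCII covers only the first 128 code points of Unicode, \u0000 through \u007F. The result is the error above: the ASCII codec has no representation for \u2013 (an en dash, by the way).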
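The failure can be reproduced without xlrd. The following is a minimal sketch written for Python 3, where the implicit str() conversion no longer exists, so the ASCII encoding is requested explicitly. The helper name try_ascii is made up for illustration.

```python
def try_ascii(text):
    """Return the ASCII bytes for text, or the error message if it cannot be encoded."""
    try:
        return text.encode('ascii')
    except UnicodeEncodeError as exc:
        return str(exc)

print(try_ascii('abc'))        # plain ASCII text encodes without trouble
print(try_ascii('ab\u2013c'))  # the en dash (U+2013) has no ASCII representation
print(ord('\u2013') > 127)     # the code point lies beyond the last ASCII code point
```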

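So you need to specify which encoding to use. Common choices are ISO-8859-1, better known as Latin-1, which covers the first 256 code points; UTF-8, which can encode every code point using a variable-length encoding; CP1252, which is common on Windows; and various Chinese and Japanese encodings.

You use them like this: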

u'\u2013'.encode('utf8')

The result is a str containing the byte sequence that is the UTF-8 representation of the character:

'\xe2\x80\x93'

You can print it (provided your terminal expects UTF-8):

>>> print '\xe2\x80\x93'
–
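To see how the encodings named in this answer treat the en dash, here is a small Python 3 sketch. The helper encode_or_none is hypothetical and exists only for the comparison; the codec names are the standard ones shipped with Python.

```python
DASH = '\u2013'  # EN DASH, the character from the error message

def encode_or_none(text, encoding):
    """Encode text, returning None when the encoding has no mapping for it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# ASCII and Latin-1 cannot represent U+2013; CP1252 and UTF-8 can.
for name in ('ascii', 'latin-1', 'cp1252', 'utf-8'):
    print(name, encode_or_none(DASH, name))
```

Note that CP1252 and UTF-8 produce different bytes for the same character, so whoever reads the bytes has to decode them with the same encoding that was used to write them.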

Answer 2

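You can also try this to get the text. Note that 'ignore' silently drops every character that ASCII cannot represent.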

foo.encode('ascii', 'ignore')
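'ignore' is only one of several standard error handlers accepted by encode(). As a rough comparison, sketched in Python 3 with a made-up sample string:

```python
text = 'pages 3\u20135'  # "pages 3-5" written with an en dash

dropped = text.encode('ascii', 'ignore')            # removes the dash entirely
marked = text.encode('ascii', 'replace')            # substitutes a question mark
escaped = text.encode('ascii', 'backslashreplace')  # keeps a readable escape sequence

print(dropped)
print(marked)
print(escaped)
```

Dropping the dash turns "3 to 5" into "35", so 'replace' or 'backslashreplace' may be the safer choice when the lost character carries meaning.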

Answer 3

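Because str(u'\u2013') raises the error here, use isinstance(foo, basestring) to check whether the value is already a str or unicode string. If it is not a basestring, convert it to Unicode first, then call encode:

# Python 2: keep the result, since encode() returns a new object
if isinstance(foo, basestring):
    encoded = foo.encode('utf8')
else:
    encoded = unicode(foo).encode('utf8')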
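basestring and unicode do not exist in Python 3. A rough Python 3 counterpart of the same idea might look like the sketch below; the function name to_utf8_bytes is hypothetical, and treating incoming bytes as already UTF-8 encoded is an assumption.

```python
def to_utf8_bytes(value):
    """Return value as UTF-8 encoded bytes, converting non-strings to text first."""
    if isinstance(value, bytes):
        return value                   # assumed to be UTF-8 encoded already
    if isinstance(value, str):
        return value.encode('utf-8')
    return str(value).encode('utf-8')  # numbers, lists, and other objects

print(to_utf8_bytes('\u2013'))
print(to_utf8_bytes(42))
```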


Answer 4

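I had the same problem. This worked fine for me. (Note that on Python 2, str(objdata) itself raises the same error when objdata is a unicode object containing non-ASCII characters, so this only helps on Python 3 or when objdata is not such a string.)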

str(objdata).encode('utf-8')

Answer 5

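I ran into this in a recent project and it was really painful. I eventually found that the cause was that Python inside our Docker container was using the encoding "ansi_x3.4-1968" instead of "utf-8". So if you are using Docker and hit this error, the following steps may solve it for good.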

Create a file named default_locale in the same directory as your Dockerfile and put this line in it:
environment=LANG="es_ES.utf8", LC_ALL="es_ES.UTF-8", LC_LANG="es_ES.UTF-8"
Add these lines to your Dockerfile:
RUN apt-get clean && apt-get update && apt-get install -y locales
RUN locale-gen en_CA.UTF-8
COPY ./default_locale /etc/default/locale
RUN chmod 0755 /etc/default/locale
ENV LC_ALL=en_CA.UTF-8
ENV LANG=en_CA.UTF-8
ENV LANGUAGE=en_CA.UTF-8

After I rebuilt and ran the Docker image again, the problem was gone. I hope this fixes it for you too.
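Before changing the image, it can help to confirm which encodings Python actually picked up inside the container. A small Python 3 sketch, using only the standard library (the function name report_encodings is made up):

```python
import codecs
import locale
import sys

def report_encodings():
    """Collect the encodings Python derived from the environment."""
    return {
        'stdout': getattr(sys.stdout, 'encoding', None),
        'preferred': locale.getpreferredencoding(False),
        'filesystem': sys.getfilesystemencoding(),
    }

print(report_encodings())
# "ansi_x3.4-1968" is registered as an alias of the ASCII codec.
print(codecs.lookup('ansi_x3.4-1968').name)
```

If the reported encodings resolve to ASCII, the locale settings above are a plausible fix; if they already say UTF-8, the cause is probably elsewhere.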

Answer 6

This works for me:

unicode(data).encode('utf-8')

Answer 7

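I tried everything and was still getting the error in a Python 3 environment (possibly because of Python 2 functions that are deprecated in Python 3 but work fine on Python 2). I used the following technique to handle special characters:

try:
    file_writer.write(data)
except UnicodeEncodeError:
    # Drop the characters that cannot be encoded, then turn the bytes back into text
    data = data.encode('ascii', 'ignore').decode('ascii')
    file_writer.write(data)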
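A different approach from the one in this answer, sketched here for Python 3, is to open the file with an explicit encoding so that no characters have to be dropped. The temporary path and the sample text are made up for the example.

```python
import os
import tempfile

data = 'range 3\u20135'  # sample text containing an en dash

# Write to a throwaway file with an explicit encoding instead of the locale default.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')
with open(path, 'w', encoding='utf-8') as file_writer:
    file_writer.write(data)

# Read it back with the same encoding to confirm nothing was lost.
with open(path, encoding='utf-8') as reader:
    restored = reader.read()
print(restored == data)
```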

Answer 8

First find out which Unicode character it is at this link: https://unicode-table.com/en/2013/

Then use this in your code:

{your-string-variable}.replace(u"\u2013", "-")

Do the same for every Unicode character that causes an error.
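When several characters need replacing, chaining replace() calls gets unwieldy. One possible alternative in Python 3 is a translation table, as in this sketch; the mapping and the function name are hypothetical and would need to be extended for your own data.

```python
# Hypothetical mapping; add whatever characters show up in your data.
ASCII_FALLBACKS = str.maketrans({
    '\u2013': '-',  # en dash
    '\u2014': '-',  # em dash
    '\u2018': "'",  # left single quotation mark
    '\u2019': "'",  # right single quotation mark
})

def to_ascii_friendly(text):
    """Replace the mapped characters with their ASCII stand-ins."""
    return text.translate(ASCII_FALLBACKS)

print(to_ascii_friendly('3\u20135 \u2018ok\u2019'))
```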


Answer 1
81 2011-03-22 07:20:17

Answer 2
29 2015-01-14 17:18:32

Answer 3
7 2015-01-05 08:40:11

Answer 4
5 2017-04-28 18:08:08

Answer 5
2 2019-11-09 15:34:27

Answer 6
0 2017-11-21 10:13:28

Answer 7
0 2019-11-05 11:26:56

Answer 8
0 2021-08-04 07:43:32
