pprint: UnicodeEncodeError: 'ascii' codec can't encode character

This is driving me crazy. I'm trying to pprint a dict containing an é character, and it fails with an error.

I'm using Python 3:

    from pprint import pprint
    knights = {'gallahad': 'the pure', 'robin': 'the bravé'}
    pprint(knights)

Error:

    File "/data/prod_envs/pythons/python36/lib/python3.6/pprint.py", line 176, in _format
      stream.write(rep)
    UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 43: ordinal not in range(128)

I read up on the Python ASCII docs, but there doesn't seem to be a quick way to solve this other than taking the dict apart, rewriting the offending value as ASCII via .encode, and then reassembling the dict.

Is there any way I can get this to print without taking the dict apart?

This is unrelated to pprint: the module only formats the object into a string and then passes the formatted string to the underlying stream. So your error occurs when the é character (U+00E9) is written to stdout.
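To see that formatting and writing are separate steps, here is a minimal sketch. The name ascii_stdout is made up for illustration: it is an in-memory stand-in for a stdout whose encoding is ASCII, not something taken from the question's environment.

```python
import io
from pprint import pformat, pprint

knights = {'gallahad': 'the pure', 'robin': 'the bravé'}

# Formatting alone never touches a stream, so it cannot raise the error.
text = pformat(knights)

# Hypothetical stand-in for a stdout that only accepts ASCII.
ascii_stdout = io.TextIOWrapper(io.BytesIO(), encoding='ascii')

try:
    pprint(knights, stream=ascii_stdout)
    write_failed = False
except UnicodeEncodeError:
    # Raised by the stream's encoder when pprint calls stream.write().
    write_failed = True

print("formatting succeeded:", 'bravé' in text)
print("write failed:", write_failed)
```

The same pprint call works once the stream it writes to has an encoding that can represent é.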

The fix really depends on the underlying OS and the configuration of the Python interpreter. On Linux or other Unix-like systems, you could try telling Python to use UTF-8 or Latin1 for its standard streams by setting the environment variable PYTHONIOENCODING in your terminal session before starting Python:

    $ export PYTHONIOENCODING=Latin1
    $ python

(or use PYTHONIOENCODING=utf8, depending on the actual encoding of your terminal or terminal window).
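The effect of the variable can be checked from Python itself. The sketch below starts a child interpreter with PYTHONIOENCODING set and reports the encoding that child attaches to its standard output. The helper name child_stdout_encoding is an assumption made for this example.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a fresh interpreter with PYTHONIOENCODING set and
    return the normalized name of its sys.stdout.encoding."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the child's output as text
        check=True,
    )
    # Normalize spellings such as "utf8" and "UTF-8" to one canonical name.
    return codecs.lookup(result.stdout.strip()).name


reported = child_stdout_encoding("utf-8")
print(reported)
```

The variable has to be in the environment before the interpreter starts, which is why the check runs in a child process rather than in the current one.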

Standard input and output are file objects in Python. The Python 3 documentation says that, when these objects are created, if encoding is left unspecified then locale.getpreferredencoding(False) is called to fetch the locale's preferred encoding.
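A quick way to see what Python has picked up on a given machine is to print both values. This is only a diagnostic sketch; the output depends entirely on the system it runs on.

```python
import locale
import sys

# The encoding Python falls back to for text streams when none is given.
preferred = locale.getpreferredencoding(False)

# The encoding actually attached to standard output
# (None if stdout was replaced by an object without one).
stdout_encoding = getattr(sys.stdout, "encoding", None)

print("preferred encoding:", preferred)
print("stdout encoding:   ", stdout_encoding)
```

If both report an ASCII variant, any non-ASCII character written to stdout is likely to raise the UnicodeEncodeError from the question.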

Your system should have been set up with one or more "locales" when GNU/Linux was installed (I'm guessing from your paths that you are using some version of GNU/Linux). On a "sensible" setup, the default locale should allow UTF-8. But if you only did a "minimal" installation (for example as part of setting up a container), or something like that, then it is possible that the system has set the locale to "C" (the ultimate fallback locale), which does not support UTF-8.

Just because your terminal can accept UTF-8 (as demonstrated by using echo with a UTF-8 string) does not mean Python knows that UTF-8 is acceptable. If Python sees the locale set to "C", it will assume only ASCII is allowed unless told otherwise.

You can check the current locale by typing locale at the shell prompt, and change it by setting the LC_ALL environment variable. But before changing it you must check with locale -a to see which locales are available on your system, otherwise your change may not take effect and you may get the "C" locale anyway. If your system has not been set up with the locale you want, you can add it if you have root access: most GNU/Linux distributions provide options to do this when you (re)configure a package called locales, so for example on Debian/Ubuntu-based distros, sudo dpkg-reconfigure locales should show you the options.

But sometimes you will be in the awkward position of having to write a Python script to run on a system that has not been set up with decent locales, and there's nothing you can do about it because you don't have root and the sysadmin insists on giving you the absolute minimum. Then what do we do?

Well, there are options within Python itself. You could run export PYTHONIOENCODING=utf-8 before running Python, to tell Python to use that encoding no matter what the locale says. Or you could give pprint a stream= parameter, set to a stream that you've opened yourself using open() with an encoding="utf-8" parameter (although this is no good if you want to write to sys.stdout or os.popen instead of a file). Or you could upgrade to Python 3.7 and use sys.stdout.reconfigure(encoding='utf-8') (but this won't work in the Python 3.6 mentioned in the original question).

Or you could import codecs, do w=codecs.getwriter("utf-8")(sys.stdout.buffer), and then pass stream=w to your pprint:

    from pprint import pprint
    import sys, codecs
    # Wrap the raw byte stream so text is always encoded as UTF-8.
    w = codecs.getwriter("utf-8")(sys.stdout.buffer)
    d = {"testing": "这是个考验"}
    pprint(d, stream=w)
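The file-based and reconfigure options mentioned earlier can be sketched in the same way. The temporary file path is a throwaway chosen for this example, and the hasattr check is only there so the snippet does not fail on Python 3.6, where reconfigure is unavailable.

```python
import os
import sys
import tempfile
from pprint import pprint

d = {"testing": "这是个考验"}

# Option 1: write to a file opened with an explicit encoding.
# The path is a throwaway temp file used only for this example.
path = os.path.join(tempfile.mkdtemp(), "pprint_output.txt")
with open(path, "w", encoding="utf-8") as f:
    pprint(d, stream=f)

# Read the file back to confirm the characters survived.
with open(path, encoding="utf-8") as f:
    written = f.read()

# Option 2 (Python 3.7+ only): switch stdout itself to UTF-8.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")
    pprint(d)
```

Option 1 sidesteps the locale completely, at the cost of not printing to the terminal. Option 2 keeps plain pprint(d) calls working for the rest of the script.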
