How to solve UnicodeDecodeError in Python 3.6?
I have switched from Python 2.7 to Python 3.6.
I have scripts that deal with some non-English content.
I usually run the scripts via cron and also in a terminal.
I had UnicodeDecodeError in my Python 2.7 scripts and solved it with this:
# encoding=utf8
import sys
reload(sys)
sys.setdefaultencoding('utf8')
Now in Python 3.6, it doesn't work. I have print statements like
print("Here %s" % (myvar))
and they throw an error. I can work around this by replacing myvar with
myvar.encode("utf-8")
but I don't want to write that in every print statement.
I set
PYTHONIOENCODING=utf-8
in my terminal and I still have the issue.
Is there a cleaner way to solve the UnicodeDecodeError issue in Python 3.6? Is there any way to tell Python 3 to print everything in UTF-8, just like I did in Python 2?
It sounds like your locale is broken and that you have another bytes->Unicode issue. The thing you did for Python 2.7 is a hack that only masked the real problem (there's a reason why you have to
reload sys
to make it work).
To fix your locale, try typing
locale
on the command line. The output should look something like:
LANG=en_GB.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_ALL=
locale depends on LANG being set properly. Python effectively uses locale to work out what encoding to use when writing to stdout. If it can't work it out, it defaults to ASCII.
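To see which encodings Python has actually derived from the environment, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper name describe_encodings is made up for illustration and is not part of the original answer.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings Python derived from the current environment."""
    return {
        # Default encoding for text files and pipes, taken from the locale.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used when print() writes to standard output
        # (None if stdout has been replaced by an object without one).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding used for file names and command-line arguments.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(name, "=", value)
```

If the stdout entry is not a UTF-8 variant, the locale is the first thing to look at. Running the same check from cron is worthwhile too, since cron jobs usually start with a much smaller environment than an interactive terminal.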
You should first attempt to fix your locale. If the
locale
command reports errors, make sure you've installed the correct language pack for your region.
If all else fails, you can always fix Python by setting
PYTHONIOENCODING=UTF-8
This should be used as a last resort, as you'll be masking problems once again.
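To confirm that the variable really reaches the interpreter, one option is to start a child Python process with it set and ask that process what it uses for stdout. This is a sketch under the assumption that the variable must be present in the environment of the Python process itself; the function name child_stdout_encoding is hypothetical.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and return its stdout encoding."""
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the child's output as text
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_stdout_encoding("UTF-8"))
```

One possible reason the setting appeared to have no effect in the question is that it was assigned in the shell without being exported, or was never set in the crontab, so the Python process never saw it. That is a guess; the question does not say.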
If Python is still throwing an error after setting
PYTHONIOENCODING
then please update your question with the stacktrace. Chances are you've got an implied conversion going on.
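The distinction matters because the two error types point at different places. The following sketch, with made-up sample data, shows that UnicodeDecodeError comes from turning bytes into str with the wrong codec, while printing a str through an ASCII-only stream fails with UnicodeEncodeError instead.

```python
# Bytes as they might arrive from a file opened in "rb" mode or from a socket.
data = "h\u00e9llo".encode("utf-8")

# Decoding with the wrong codec is what raises UnicodeDecodeError.
try:
    data.decode("ascii")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Decoding explicitly with the right codec yields a normal str.
text = data.decode("utf-8")

# Encoding a str for an ASCII-only stream fails differently: UnicodeEncodeError.
try:
    text.encode("ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

if __name__ == "__main__":
    print(type(decode_error).__name__)
    print(type(encode_error).__name__)
```

So if the traceback really says UnicodeDecodeError, the line to look for is one where bytes are being converted to text, not the print call itself.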
I had this issue when using Python inside a Docker container based on Ubuntu 18.04. It appeared to be a locale issue, which was solved by adding the following to the Dockerfile:
ENV LANG C.UTF-8
For a Python-only solution you will have to recreate your
sys.stdout
object:
import sys, codecs
sys.stdout = codecs.getwriter('utf-8')(sys.stdout.detach())
After this, a normal
print("hello world")
should be encoded to UTF-8 automatically.
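The same idea can also be written with io.TextIOWrapper, which is the class Python itself uses for text streams. This is an alternative sketch, not the code from the answer above; the helper name utf8_text_stream is made up for illustration.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    # line_buffering=True flushes after each newline, like an interactive stdout.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


# Intended use at the top of a script (left as a comment so that importing
# this sketch does not replace stdout as a side effect):
#
#     import sys
#     sys.stdout = utf8_text_stream(sys.stdout.detach())
```

On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") achieves the same without replacing the object, but that method is not available in 3.6.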
But you should try to find out why your terminal is set to such a strange encoding (which Python just tries to adopt). Maybe your operating system is misconfigured somehow.
EDIT: In my tests, unsetting the env variable
LANG
produced this strange setting for the stdout encoding for me:
LANG= python3
import sys
sys.stdout.encoding
printed
'ANSI_X3.4-1968'
So I guess you might want to set your
LANG
to something like
en_US.UTF-8
Your terminal program doesn't seem to do this.
To everyone using pickle to load a file previously saved in Python 2 and getting a UnicodeDecodeError: try setting pickle's
encoding
parameter:
import pickle
with open("./data.pkl", "rb") as data_file:
    samples = pickle.load(data_file, encoding='latin1')
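To illustrate why the parameter helps, here is a small sketch that uses a hand-written protocol-0 payload standing in for a real Python 2 pickle (an assumption for illustration; real files will be larger and often use protocol 2). Python 3 decodes Python 2 byte strings with ASCII by default, so any byte above 0x7F fails unless another encoding is given.

```python
import pickle

# Hand-written protocol-0 pickle of the Python 2 byte string '\xe9'.
# This stands in for the contents of a file written by Python 2.
PY2_PICKLE = b"S'\\xe9'\np0\n."


def load_py2_pickle(payload, encoding="latin1"):
    """Unpickle Python 2 data, decoding its 8-bit strings with the given encoding."""
    return pickle.loads(payload, encoding=encoding)


if __name__ == "__main__":
    print(repr(load_py2_pickle(PY2_PICKLE)))                    # decoded as latin1 text
    print(repr(load_py2_pickle(PY2_PICKLE, encoding="bytes")))  # kept as raw bytes
```

latin1 never fails because every byte maps to a code point, but the resulting text is only correct if the data really was Latin-1. Passing encoding="bytes" keeps the original bytes so they can be decoded later with whatever codec is actually right.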
For Docker with Python 3.6, running
LANG=C.UTF-8 python or jupyter xxx
works for me, thanks to @Daniel and @zhy.
Python 3 (including 3.6) already supports Unicode. Here is the doc: https://docs.python.org/3/howto/unicode.html
So you don't need to force Unicode support the way you did in Python 2.7. Try to run your code normally. If you get an error while reading a Unicode text file, pass the
encoding='utf-8'
parameter when opening the file.
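As a rough illustration of that advice, the sketch below reads a file with an explicit encoding instead of relying on the locale default. The helper names read_text and write_sample are hypothetical, and the sample file is a temporary one created only for the demonstration.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file, decoding it with an explicit encoding."""
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


def write_sample(text):
    """Write text as UTF-8 bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as handle:
        handle.write(text.encode("utf-8"))
    return path


if __name__ == "__main__":
    sample_path = write_sample("caf\u00e9")
    try:
        print(len(read_text(sample_path)))
    finally:
        os.remove(sample_path)
```

Without the encoding argument, open() falls back to the locale's preferred encoding, which is how a broken locale turns into a UnicodeDecodeError on read.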
I mean you could write a custom function like this (not optimal, I know):
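import sys
def printUTF8(input):
    # Write UTF-8 bytes to the binary layer of stdout; in Python 3,
    # print(input.encode("utf-8")) would show the b'...' repr instead.
    sys.stdout.buffer.write(input.encode("utf-8") + b"\n")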