Python UTF8 encoding 2.7.5 / 3.8.5

What I'm trying to understand: I'm running Python 3.8.5 on Windows and Python 2.7.5 on my web server.

I'm trying to translate values from a JSON file with code like this:

hash = ""
try:
    hash = str(translateTable[item["hash"]])
except:
    hash = str(item["hash"])

The following code loads the JSON file:

import io
import json
with io.open('translate.json', encoding="utf-8") as fh:
    translateTable = json.load(fh)

The JSON file (translate.json): {"vunk": "Vunk-Gerät"}

When I run the code on Windows with 3.8.5, the result is what it should be:

IN >>> python test.py
OUT>>> Vunk-Gerät

Here comes the tricky part: when I run it on my web server with Python 2.7.5, the result is this:

IN >>> python test.py
OUT>>> vunk

The problem is that on the web server it can't translate values containing "Ä, Ö, Ü, ß", and I don't understand why.

The most likely problem is that the values loaded from the JSON object are unicode rather than str. In Python 2, unicode is the equivalent of str in Python 3, and Python 2's str is the equivalent of Python 3's bytes. So the problem may be:

>>> transtable = {u"vunk": u"Vunk-Gerät"}
>>> str(transtable['vunk'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 8: ordinal not in range(128)

This happens because Python 2's str tries to encode u"Vunk-Gerät" to ASCII, but it cannot (because of the "ä"). In the question's code the bare except swallows this error, so the fallback branch runs and the untranslated key, vunk, is printed instead.
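The implicit step can be made visible with a minimal sketch (the variable names are illustrative, not from the question). Encoding the text to ASCII explicitly, which is what Python 2's str() does behind the scenes for a unicode value, fails, while encoding to UTF-8 succeeds:

```python
# -*- coding: utf-8 -*-
text = u"Vunk-Gerät"

# Explicit ASCII encoding fails on the non-ASCII character "ä".
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent every character, so this succeeds.
utf8_bytes = text.encode("utf-8")
```

The snippet behaves the same on Python 2 and Python 3 because the encoding is named explicitly instead of being left to the interpreter's default.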

The simplest solution might be to avoid calling str at all:

hash = ""
try:
    hash = translateTable[item["hash"]]
except Exception as ex:
    hash = item["hash"]

since the keys and values should be usable as they are.
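As a sketch of the same idea, assuming a missing key is the only failure worth handling, dict.get can replace the try/except entirely. The translate helper and the inline JSON string are illustrative stand-ins for the question's translate.json, not part of the original code:

```python
# -*- coding: utf-8 -*-
import json

# Inline stand-in for the contents of translate.json (assumption).
translate_table = json.loads(u'{"vunk": "Vunk-Gerät"}')


def translate(key, table):
    """Return the translation for key, or the key itself if none exists."""
    return table.get(key, key)


translated = translate(u"vunk", translate_table)
untranslated = translate(u"unknown", translate_table)
```

Because no str() call is involved, the value stays a text string on both Python versions and no implicit ASCII encoding takes place.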

A more robust approach would be to use the six library to handle string and bytes types in a way that works with both Python 2 and Python 3. The ideal solution, as others have pointed out, is to run Python 3 on your server; Python 3 is much easier to use when processing non-ASCII text.
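To illustrate what such a compatibility layer provides, here is a stdlib-only sketch. six exposes text_type and binary_type; the version below defines stand-ins with the same names by hand so that it runs without installing anything. The to_text helper is hypothetical and not part of six:

```python
# -*- coding: utf-8 -*-
import sys

# Hand-rolled stand-ins for six.text_type and six.binary_type.
if sys.version_info[0] >= 3:
    text_type = str
    binary_type = bytes
else:
    text_type = unicode  # noqa: F821 (only evaluated on Python 2)
    binary_type = str


def to_text(value, encoding="utf-8"):
    """Return value as a text string on both Python 2 and Python 3."""
    if isinstance(value, binary_type):
        # Byte strings are decoded explicitly instead of relying on ASCII.
        return value.decode(encoding)
    return text_type(value)
```

Calling to_text(translateTable[item["hash"]]) in place of str(...) would keep the value as text on either interpreter, under the assumption that any byte strings in play are UTF-8 encoded.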

For anyone facing the same problem as me, here is the solution for 2.7.5:

from django.utils.encoding import smart_str
# On Python 2, smart_str returns a UTF-8 encoded byte string
hash = ""
try:
    hash = smart_str(translateTable[item["hash"]])
except Exception as ex:
    hash = smart_str(item["hash"])

Also make sure Django is installed (Django 1.11 is the last release series that supports Python 2.7):

pip install django
