Writing JSON to file causing Unicode Error on Ubuntu Server 16.04 LTS

I am sure that the problem I am experiencing isn't directly related to the OS or its version, but to some kind of setup. In a Python Django app I am writing JSON to a file that can contain characters from other languages, such as 我, ל, and は. As far as I am aware, I never change the encoding at any point in the flow of data.

During local development, this was not a problem.

    with open(self._json_path, 'w') as f:
        json.dump(test_dict, f, indent=2, ensure_ascii=False)
    answer, wordid, question = self._unpack_dict(test_dict)

Once I deployed to the live web server, I began getting:

'ascii' codec can't encode character '\u90fd' in position 1: ordinal not in range(128)

I know for a fact that the data in test_dict is encoded properly. The error is raised as soon as json.dump runs. If I open the file that was created, its content is cut off at the very first non-Latin character I put into it.
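The failure can be reproduced without a web server by opening the file with an explicit ASCII encoding. This is only a stand-in for a process whose default text encoding resolves to ASCII, which is an assumption about the server rather than something shown directly above. A minimal sketch; the sample dictionary, the helper try_dump and the temporary path are hypothetical:

```python
import json
import os
import tempfile

# Hypothetical sample data standing in for test_dict.
sample = {"answer": "都", "question": "all"}

def try_dump(encoding):
    """Write sample as JSON using the given file encoding.

    Returns the exception class name on an encoding failure, else None.
    """
    path = os.path.join(tempfile.mkdtemp(), "test.json")
    try:
        with open(path, "w", encoding=encoding) as f:
            json.dump(sample, f, indent=2, ensure_ascii=False)
    except UnicodeEncodeError as exc:
        return type(exc).__name__
    return None

ascii_result = try_dump("ascii")  # mimics a process whose default encoding is ASCII
utf8_result = try_dump("utf-8")   # explicit UTF-8 does not depend on the locale
print(ascii_result, utf8_result)
```

With ensure_ascii=False the json module passes the characters through unescaped, so it is the file object's encoding that decides whether the write succeeds.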

I've been through this post, but couldn't sort out the problem. Adding , encoding='utf8' causes the above code to create a file but put nothing in it. Again, I know for a fact that test_dict has data, as the data is displayed properly on the web page.

**Answer to blank file:** while troubleshooting I had switched from dump to dumps, which caused the files to be created but not filled.

**Answer to main problem:** what ended up working for me was encoding='utf-8' instead of encoding='utf8'.
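The blank-file symptom follows from the difference between the two functions: json.dump writes into a file object, while json.dumps only returns a string and never touches the file that open() has already created. The sketch below illustrates this with a hypothetical dictionary and temporary paths. It also checks, via codecs.lookup, that Python resolves 'utf8' and 'utf-8' to the same codec, which suggests the spelling itself was probably not what changed the outcome.

```python
import codecs
import json
import os
import tempfile

data = {"word": "は"}  # hypothetical sample data
tmpdir = tempfile.mkdtemp()

# json.dumps returns a str; nothing reaches the file unless we write it ourselves.
empty_path = os.path.join(tmpdir, "dumps.json")
with open(empty_path, "w", encoding="utf-8") as f:
    text = json.dumps(data, ensure_ascii=False)  # result is not written to f

# json.dump takes the file object and writes the JSON into it.
full_path = os.path.join(tmpdir, "dump.json")
with open(full_path, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

empty_size = os.path.getsize(empty_path)  # file was created but never filled
full_size = os.path.getsize(full_path)

# Both spellings are aliases for the same codec.
same_codec = codecs.lookup("utf8").name == codecs.lookup("utf-8").name
print(empty_size, full_size, same_codec)
```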

I've also tried rebuilding the virtual environment.

Here are some results from the server:

    echo $LANG
    en_US.UTF-8
    python -c "import sys; print(sys.stdout.encoding)"
    UTF-8
    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=

Local environment:

    echo $LANG
    en_US.UTF-8
    python -c "import sys; print(sys.stdout.encoding)"
    UTF-8
    LANG=en_US.UTF-8
    LANGUAGE=en_US
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=

The most notable difference is the Python version:

Server: Python 3.5

Local environment: Python 3.7.2
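The locale values above come from an interactive shell, and a process started by Apache does not necessarily inherit them. On the Python versions in question, open() in text mode without an encoding argument falls back to locale.getpreferredencoding(False), so logging that value from inside the running application is a more direct check than echo $LANG. A small sketch, assuming it is called from a view or at startup; the function name report_encodings is hypothetical:

```python
import locale
import os
import sys

def report_encodings():
    """Collect the encoding-related settings visible to the current process."""
    return {
        "python_version": sys.version.split()[0],
        # Default used by open() in text mode when no encoding is passed.
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Environment as seen by this process, which may differ from the login shell.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }

info = report_encodings()
for key, value in info.items():
    print(key, "=", value)
```

If the preferred encoding reported inside the web process is an ASCII variant while the shell reports UTF-8, the two environments differ, and passing the encoding explicitly to open() removes the dependency on either.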

apache2/error.log:

[Fri Feb 15 09:49:15.899753 2019] [wsgi:error] [pid 2489] GETTING NEW TEST
[Fri Feb 15 09:49:15.957897 2019] [wsgi:error] [pid 2489] Saving the test dictionary
[Fri Feb 15 09:49:16.006470 2019] [wsgi:error] [pid 2489] Internal Server Error: /test/addohm/
[Fri Feb 15 09:49:16.006506 2019] [wsgi:error] [pid 2489] Traceback (most recent call last):
[Fri Feb 15 09:49:16.006509 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/exception.py", line 34, in in$
[Fri Feb 15 09:49:16.006511 2019] [wsgi:error] [pid 2489]     response = get_response(request)
[Fri Feb 15 09:49:16.006514 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 126, in _get_r$
[Fri Feb 15 09:49:16.006517 2019] [wsgi:error] [pid 2489]     response = self.process_exception_by_middleware(e, request)
[Fri Feb 15 09:49:16.006520 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 124, in _get_r$
[Fri Feb 15 09:49:16.006522 2019] [wsgi:error] [pid 2489]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Fri Feb 15 09:49:16.006525 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/views.py", line 125, in test
[Fri Feb 15 09:49:16.006531 2019] [wsgi:error] [pid 2489]     form = TestForm(wordsdict)
[Fri Feb 15 09:49:16.006533 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/forms.py", line 21, in __init__
[Fri Feb 15 09:49:16.006536 2019] [wsgi:error] [pid 2489]     json.dumps(test_dict, f, indent=2, ensure_ascii=False)
[Fri Feb 15 09:49:16.006538 2019] [wsgi:error] [pid 2489]   File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
[Fri Feb 15 09:49:16.006541 2019] [wsgi:error] [pid 2489]     fp.write(chunk)
[Fri Feb 15 09:49:16.006545 2019] [wsgi:error] [pid 2489] UnicodeEncodeError: 'ascii' codec can't encode character '\\u7ea6' in position 1: ordinal not in range(128)
[Fri Feb 15 09:49:16.006550 2019] [wsgi:error] [pid 2489]
[Fri Feb 15 09:51:36.924025 2019] [wsgi:error] [pid 2627] [client 124.9.54.252:55825] Timeout when reading response headers from daemon process 'duotool.addohm.net': /var/www/django/du$
[Fri Feb 15 09:51:43.283083 2019] [wsgi:error] [pid 2631] [client 124.9.54.252:55971] Truncated or oversized response headers received from daemon process 'duotool.addohm.net': /var/ww$
[Fri Feb 15 09:51:43.283324 2019] [wsgi:error] [pid 2630] [client 124.9.54.252:55888] Truncated or oversized response headers received from daemon process 'duotool.addohm.net': /var/ww$
[Fri Feb 15 09:53:18.572055 2019] [wsgi:error] [pid 2489] GETTING NEW TEST
[Fri Feb 15 09:53:18.631474 2019] [wsgi:error] [pid 2489] Saving the test dictionary
[Fri Feb 15 09:53:18.675315 2019] [wsgi:error] [pid 2489] Internal Server Error: /test/addohm/
[Fri Feb 15 09:53:18.675335 2019] [wsgi:error] [pid 2489] Traceback (most recent call last):
[Fri Feb 15 09:53:18.675338 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/exception.py", line 34, in in$
[Fri Feb 15 09:53:18.675341 2019] [wsgi:error] [pid 2489]     response = get_response(request)
[Fri Feb 15 09:53:18.675344 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 126, in _get_r$
[Fri Feb 15 09:53:18.675347 2019] [wsgi:error] [pid 2489]     response = self.process_exception_by_middleware(e, request)
[Fri Feb 15 09:53:18.675349 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 124, in _get_r$
[Fri Feb 15 09:53:18.675352 2019] [wsgi:error] [pid 2489]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Fri Feb 15 09:53:18.675354 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/views.py", line 125, in test
[Fri Feb 15 09:53:18.675357 2019] [wsgi:error] [pid 2489]     form = TestForm(wordsdict)
[Fri Feb 15 09:53:18.675359 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/forms.py", line 21, in __init__
[Fri Feb 15 09:53:18.675362 2019] [wsgi:error] [pid 2489]     json.dumps(test_dict, f, indent=2, ensure_ascii=False)
[Fri Feb 15 09:53:18.675365 2019] [wsgi:error] [pid 2489]   File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
[Fri Feb 15 09:53:18.675367 2019] [wsgi:error] [pid 2489]     fp.write(chunk)
[Fri Feb 15 09:53:18.675371 2019] [wsgi:error] [pid 2489] UnicodeEncodeError: 'ascii' codec can't encode character '\\u90fd' in position 1: ordinal not in range(128)
[Fri Feb 15 09:53:18.675376 2019] [wsgi:error] [pid 2489]
[Fri Feb 15 09:53:19.498167 2019] [wsgi:error] [pid 2652] /path/to/website/duotool

I can't see any difference between the two versions that would cause this. Where do I go from here?

I think you should specify the encoding of the file when you open it for writing. This should work:

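    import json

    # Name the encoding explicitly so the write does not depend on the server locale.
    with open(self._json_path, 'w', encoding='utf-8') as f:
        f.write(json.dumps(your_json, ensure_ascii=False))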
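A variation on the same idea is to keep json.dump and the with block from the question and only add the encoding argument, then read the file back to confirm the characters survive. A minimal sketch; save_json, load_json and the sample data are hypothetical names used for illustration:

```python
import json
import os
import tempfile

def save_json(path, data):
    """Write data as UTF-8 JSON, keeping non-ASCII characters unescaped."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)

def load_json(path):
    """Read UTF-8 JSON back into Python objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

sample = {"zh": "我", "he": "ל", "ja": "は"}
path = os.path.join(tempfile.mkdtemp(), "test.json")
save_json(path, sample)
restored = load_json(path)
print(restored == sample)
```

Passing the same encoding on both the write and the read side keeps the round trip independent of whatever default the hosting process happens to use.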
