Redirecting Python output to a file causes UnicodeEncodeError on Windows

I'm trying to redirect the output of a Python script to a file. When the output contains non-ASCII characters, it works on macOS and Linux but not on Windows.

I've reduced the problem to a simple test. The following is what the Windows command prompt window shows. The test is a single print call.

Microsoft Windows [Version 10.0.17134.472]
(c) 2018 Microsoft Corporation. All rights reserved.

D:\>set PY
PYTHONIOENCODING=utf-8

D:\>type pipetest.py
print('\u0422\u0435\u0441\u0442')

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt

D:\>type test.txt
Тест

D:\>type test.txt | iconv -f utf-8 -t utf-8
Тест

D:\>set PYTHONIOENCODING=

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt
Traceback (most recent call last):
  File "pipetest.py", line 1, in <module>
    print('\u0422\u0435\u0441\u0442')
  File "C:\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>

D:\>python -V
Python 3.7.2

As one can see, setting the PYTHONIOENCODING environment variable helps, but I don't understand why it needs to be set. When the output is a terminal it works, but when the output is a file it fails. Why is cp1252 used when stdout is not a console?
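To see what differs between the two cases, a small diagnostic along these lines may help. It is only a sketch: the helper names `can_encode` and `describe_stdout` are made up for illustration, and the report goes to stderr so that it stays visible when stdout is redirected. It also checks directly that the test string cannot be encoded with cp1252 but can be with UTF-8, which matches the traceback above.

```python
import sys

# The same string that pipetest.py prints.
TEXT = '\u0422\u0435\u0441\u0442'


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def describe_stdout():
    """Report how sys.stdout is configured (hypothetical helper).

    The report is written to stderr, so it is still shown on the console
    when stdout is redirected with `> test.txt`.
    """
    stream = sys.stdout
    print('isatty:  ', stream.isatty(), file=sys.stderr)
    print('encoding:', getattr(stream, 'encoding', None), file=sys.stderr)
    print('errors:  ', getattr(stream, 'errors', None), file=sys.stderr)


if __name__ == '__main__':
    describe_stdout()
    for name in ('cp1252', 'utf-8'):
        print(name, 'can encode the test string:',
              can_encode(TEXT, name), file=sys.stderr)
```

Running it once directly and once with `> test.txt`, with PYTHONIOENCODING unset, should show whether the reported encoding changes between the console and the redirected case.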

Maybe it is a bug that can be fixed in the Windows version of Python?
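Whether or not this counts as a bug, a script can also pick the encoding itself instead of relying on the environment variable. The following is a minimal sketch, assuming Python 3.7 or later, where text streams have a `reconfigure()` method; the helper name `force_utf8` is invented for this example. It does not explain why cp1252 is chosen by default, it only avoids depending on that default.

```python
import io
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 if it supports reconfigure().

    Hypothetical helper: streams without reconfigure() (for example an
    io.StringIO used in tests) are returned unchanged.
    """
    reconfigure = getattr(stream, 'reconfigure', None)
    if reconfigure is not None:
        reconfigure(encoding='utf-8')
    return stream


if __name__ == '__main__':
    # Must run before anything non-ASCII is printed.
    force_utf8(sys.stdout)
    print('\u0422\u0435\u0441\u0442')
```

With this at the top of pipetest.py, `python pipetest.py > test.txt` should write UTF-8 bytes to the file even when PYTHONIOENCODING is unset.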
