简体   繁体   中英

Redirecting python output to a file causes UnicodeEncodeError on Windows

I'm trying to redirect output of python script to a file. When output contains non-ascii characters it works on macOS and Linux, but not on Windows.

I've deduced the problem to a simple test. The following is what is shown in Windows command prompt window. The test is only one print call.

Microsoft Windows [Version 10.0.17134.472]
(c) 2018 Microsoft Corporation. All rights reserved.

D:\>set PY
PYTHONIOENCODING=utf-8

D:\>type pipetest.py
print('\u0422\u0435\u0441\u0442')

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt

D:\>type test.txt
Тест

D:\>type test.txt | iconv -f utf-8 -t utf-8
Тест

D:\>set PYTHONIOENCODING=

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt
Traceback (most recent call last):
  File "pipetest.py", line 1, in <module>
    print('\u0422\u0435\u0441\u0442')
  File "C:\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>

D:\>python -V
Python 3.7.2

As one can see setting PYTHONIOENCODING environment variable helps but I don't understand why it needed to be set. When output is terminal it works but if output is a file it fails. Why does cp1252 is used when stdout is not a console?

Maybe it is a bug and can be fixed in Windows version of python?

I'm trying to redirect output of python script to a file. When output contains non-ascii characters it works on macOS and Linux, but not on Windows.

I've deduced the problem to a simple test. The following is what is shown in Windows command prompt window. The test is only one print call.

Microsoft Windows [Version 10.0.17134.472]
(c) 2018 Microsoft Corporation. All rights reserved.

D:\>set PY
PYTHONIOENCODING=utf-8

D:\>type pipetest.py
print('\u0422\u0435\u0441\u0442')

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt

D:\>type test.txt
Тест

D:\>type test.txt | iconv -f utf-8 -t utf-8
Тест

D:\>set PYTHONIOENCODING=

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt
Traceback (most recent call last):
  File "pipetest.py", line 1, in <module>
    print('\u0422\u0435\u0441\u0442')
  File "C:\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>

D:\>python -V
Python 3.7.2

As one can see setting PYTHONIOENCODING environment variable helps but I don't understand why it needed to be set. When output is terminal it works but if output is a file it fails. Why does cp1252 is used when stdout is not a console?

Maybe it is a bug and can be fixed in Windows version of python?

I'm trying to redirect output of python script to a file. When output contains non-ascii characters it works on macOS and Linux, but not on Windows.

I've deduced the problem to a simple test. The following is what is shown in Windows command prompt window. The test is only one print call.

Microsoft Windows [Version 10.0.17134.472]
(c) 2018 Microsoft Corporation. All rights reserved.

D:\>set PY
PYTHONIOENCODING=utf-8

D:\>type pipetest.py
print('\u0422\u0435\u0441\u0442')

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt

D:\>type test.txt
Тест

D:\>type test.txt | iconv -f utf-8 -t utf-8
Тест

D:\>set PYTHONIOENCODING=

D:\>python pipetest.py
Тест

D:\>python pipetest.py > test.txt
Traceback (most recent call last):
  File "pipetest.py", line 1, in <module>
    print('\u0422\u0435\u0441\u0442')
  File "C:\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>

D:\>python -V
Python 3.7.2

As one can see setting PYTHONIOENCODING environment variable helps but I don't understand why it needed to be set. When output is terminal it works but if output is a file it fails. Why does cp1252 is used when stdout is not a console?

Maybe it is a bug and can be fixed in Windows version of python?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM