UnicodeEncodeError raised in logging.StreamHandler

I migrated my Python code from a Windows 10 host to Windows Server 2012 R2. Surprisingly, it stopped working correctly and now shows this warning message: "UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>"

I've tried to execute a command:

set PYTHONLEGACYWINDOWSSTDIO=yes

My code:

import logging
import sys

def get_console_handler():
    console_handler = logging.StreamHandler(sys.stdout)
    return console_handler


def get_logger():
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
    logger.addHandler(get_console_handler())
    return logger


my_logger = get_logger()
my_logger.debug("Это отладочное сообщение".encode("cp1252"))

What should I do to get rid of this warning?

Update: Colleagues, I am sorry for misleading you; I was obviously tired after long hours of bug tracking. The problem is not connected with calling .encode() as such. I suppose it is connected with the default encoding Python uses for console I/O. The original code fetches data from a DB in the cp1251 charset, and the problem appears when Python tries to convert it to cp1252.

Here is another way to reproduce the error.

  1. Create a plain text file, e.g. test.txt, with the text "Это отладочное сообщение" and save it as cp1252.
  2. Start the Python console and enter f = open("test.txt"), then f.read().

Output:

>>> f = open("test.txt")
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\project\venv\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 29: character maps to <undefined>
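
The byte named in the traceback can be checked without the file. This is a minimal sketch, assuming test.txt actually holds the UTF-8 bytes of the sample sentence. That is an assumption, suggested by the fact that the UTF-8 form of this text has 0x81 at offset 29, and 0x81 is one of the few bytes Python's cp1252 codec leaves undefined.

```python
# Assumption: test.txt holds the UTF-8 bytes of the sample sentence.
text = "Это отладочное сообщение"
raw = text.encode("utf-8")

# Decoding with cp1252 (the locale default in the question) fails on the
# first byte that cp1252 leaves undefined.
try:
    raw.decode("cp1252")
    failure = None
except UnicodeDecodeError as exc:
    failure = (exc.start, raw[exc.start])

# Decoding with the encoding the bytes were written in succeeds. For a file,
# the equivalent is open("test.txt", encoding="utf-8").
decoded = raw.decode("utf-8")
```

Passing encoding= to open() explicitly keeps the result independent of locale.getpreferredencoding().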

Use encode("utf-8"). Here is a list of python encodings: https://docs.python.org/2.4/lib/standard-encodings.html

my_logger.debug("Это отладочное сообщение".encode("utf-8"))

Then use .decode("utf-8") to see the printable value of your string.
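
A short sketch of what these calls do in Python 3, using only the sample string from the question. Note that encode() returns bytes, so a Python 3 logger given the encoded value would print its repr rather than readable text; logging the str itself may be the simpler route.

```python
text = "Это отладочное сообщение"

# UTF-8 can represent every character, so the round trip is lossless.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# cp1252 has no Cyrillic letters, so encoding fails on the first word,
# which is where "position 0-2" in the question's message comes from.
try:
    text.encode("cp1252")
    bad_span = None
except UnicodeEncodeError as exc:
    bad_span = (exc.start, exc.end)  # end is exclusive
```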

The problem lies in how logging.StreamHandler writes to the console: unlike FileHandler, it gives you no way to choose the output encoding. If the default system encoding does not match the one you need, you can run into this issue.

In my case, I wanted to output cp1251 lines, while the system default encoding was:

import locale
locale.getpreferredencoding()

'cp1252'

I solved this by changing the system locale (see https://stackoverflow.com/a/11234956/9851754 ): choose "Change system locale..." for non-Unicode programs. No code changes were needed. Afterwards the preferred encoding is reported as:

import locale
locale.getpreferredencoding()

'cp1251'
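
Where changing the system locale is not possible, another option may be to give StreamHandler a stream whose encoding is chosen explicitly. The sketch below is an illustration, not part of the original answer: it wraps an in-memory io.BytesIO buffer (a stand-in for sys.stdout.buffer) in io.TextIOWrapper with cp1251, and the logger name "encoding_demo" is hypothetical.

```python
import io
import logging

# Stand-in for sys.stdout.buffer so the written bytes can be inspected.
raw_buffer = io.BytesIO()
# The wrapper, not the locale, decides how text is encoded on output.
stream = io.TextIOWrapper(raw_buffer, encoding="cp1251", errors="replace")

logger = logging.getLogger("encoding_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the message away from root handlers
handler = logging.StreamHandler(stream)
logger.addHandler(handler)

logger.debug("Это отладочное сообщение")
handler.flush()  # also flushes the wrapped stream

written = raw_buffer.getvalue()
```

With a real console the same idea would be io.TextIOWrapper(sys.stdout.buffer, encoding=...) or, on Python 3.7 and later, sys.stdout.reconfigure(encoding=...). Whether the text then displays correctly still depends on the console's code page.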

I tested your code with Python 3.6.8 and it worked for me without any changes.

Python 3.6.8:

$ python3 -V
Python 3.6.8
$ python3 test.py
Это отладочное сообщение

But when I tested it with Python 2.7.15+, I got an error similar to yours.

Python 2.7.15+ with your implementation:

$ python2 -V
Python 2.7.15+
$ python2 test.py
  File "test.py", line 17
SyntaxError: Non-ASCII character '\xd0' in file test.py on line 17, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

After I put the following line at the very top of the file, it worked for me.

Beginning of the code:

# -*- coding: utf-8 -*-
import logging
import sys
...

Output with Python 2.7.15+ and with modified code:

$ python2 -V
Python 2.7.15+
$ python2 test.py
Это отладочное сообщение
