UnicodeDecodeError with the sys.stdout inside traceback.print_exc()

Question

I am getting a UnicodeDecodeError when calling traceback.print_exc(file=sys.stdout). I am using Python 3.4; the same code did not have this problem under Python 2.7.

Am I missing something here? How can I make sure that sys.stdout passes correctly encoded/decoded text to traceback.print_exc()?

My code looks something similar to this:

import sys
import traceback

try:
    pass  # do something which might throw an exception
except Exception as e:
    # do something
    traceback.print_exc(file=sys.stdout)  # Here I am getting the error

Error log:

  traceback.print_exc(file=sys.stdout)
  File "C:\Python34\lib\traceback.py", line 252, in print_exc
    print_exception(*sys.exc_info(), limit=limit, file=file, chain=chain)
  File "C:\Python34\lib\traceback.py", line 169, in print_exception
    for line in _format_exception_iter(etype, value, tb, limit, chain):
  File "C:\Python34\lib\traceback.py", line 153, in _format_exception_iter
    yield from _format_list_iter(_extract_tb_iter(tb, limit=limit))
  File "C:\Python34\lib\traceback.py", line 18, in _format_list_iter
    for filename, lineno, name, line in extracted_list:
  File "C:\Python34\lib\traceback.py", line 65, in _extract_tb_or_stack_iter
    line = linecache.getline(filename, lineno, f.f_globals)
  File "C:\Python34\lib\linecache.py", line 15, in getline
    lines = getlines(filename, module_globals)
  File "C:\Python34\lib\linecache.py", line 41, in getlines
    return updatecache(filename, module_globals)
  File "C:\Python34\lib\linecache.py", line 127, in updatecache
    lines = fp.readlines()
  File "C:\Python34\lib\codecs.py", line 313, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 5213: invalid continuation byte

Answer 1

The traceback module wants to include source code lines with the traceback. Normally, a traceback consists only of pointers to source code, not the source code itself, because Python has been executing the compiled bytecode. The bytecode carries hints that record exactly which source line each instruction came from.

To show the source code, the actual source is then read from disk using the linecache module. This means Python has to determine the encoding of those files as well. The default encoding for a Python 3 source file is UTF-8, but you can add a PEP 263 comment to let Python know if you are deviating from that.
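As a rough illustration of that encoding step, here is a minimal sketch using tokenize.detect_encoding, which as far as I know is the standard library function that honours a PEP 263 comment. The helper name source_encoding and the sample byte strings are made up for this example; they are not taken from the question.

```python
import io
import tokenize


def source_encoding(source_bytes):
    """Return the encoding Python would use to decode these source bytes."""
    # detect_encoding() expects a readline callable that yields bytes.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Hypothetical source saved as Latin-1: 0xe7 is the letter "c with cedilla".
undeclared = b"name = 'fa\xe7ade'\n"
declared = b"# -*- coding: latin-1 -*-\n" + undeclared

# With a PEP 263 comment the declared encoding is picked up.
print(source_encoding(declared))  # 'iso-8859-1'

# Without the comment the bytes would be treated as UTF-8, which fails.
try:
    undeclared.decode("utf-8")
except UnicodeDecodeError as exc:
    print("Not valid UTF-8:", exc.reason)
```

A file like the second one, without a declaration, is the kind of thing that produces the "can't decode byte 0xe7" error from the question when it is read back for the traceback.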

Because the source code is read only after the code has been loaded and a traceback has occurred, it is possible that you changed the source code after starting the script, or that there was a bytecode cache file (in a __pycache__ subdirectory) that appeared to be fresh but no longer matched your source files.

Either way, when you started the script, Python was able to re-use a bytecode cache file or read the source code just fine and run your code. But by the time the traceback was being printed, at least one of the source code files was no longer decodable as UTF-8.

If you can reliably reproduce the traceback (that is, the script starts again without encoding problems but printing the traceback still fails), it is most likely a stale bytecode file somewhere, one that could even hold pointers to a filename that now contains nothing but binary data, not plain source.
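To check whether a cached bytecode file exists for a given source file, something along these lines might help. It is only a sketch: cached_bytecode_for is a name I made up, and my_module.py is a placeholder for whichever of your files you suspect. importlib.util.cache_from_source is available from Python 3.4 onwards.

```python
import importlib.util
import os


def cached_bytecode_for(source_path):
    """Return the cached bytecode path for source_path, or None if there is none."""
    # cache_from_source() only computes the path; the file may not exist.
    cache_path = importlib.util.cache_from_source(source_path)
    return cache_path if os.path.exists(cache_path) else None


# "my_module.py" is a placeholder; substitute the file you suspect.
print(cached_bytecode_for("my_module.py"))
```

If a path is reported, moving that file aside and re-running the script forces Python to recompile from the current source, which should show whether the cache was stale.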

If you know how to use the pdb module, add a pdb.set_trace() call before the traceback.print_exc() call and trace which filenames the linecache module is loading.

Otherwise, edit your C:\Python34\lib\traceback.py file and insert a print('Loading {} from the linecache'.format(filename)) statement just before the linecache.checkcache(filename) line in the _extract_tb_or_stack_iter function.
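If you would rather not modify the standard library, a less invasive option is to walk the traceback yourself and try to decode each file it points to. The following is a minimal sketch under the assumption that linecache reads files via tokenize.open, as it does in Python 3.4; the function name undecodable_sources is my own invention.

```python
import sys
import tokenize


def undecodable_sources(tb):
    """Return (filename, error) pairs for traceback files that fail to decode."""
    problems = []
    seen = set()
    while tb is not None:
        filename = tb.tb_frame.f_code.co_filename
        tb = tb.tb_next
        if filename in seen:
            continue
        seen.add(filename)
        try:
            # tokenize.open() applies the PEP 263 / UTF-8 default rules.
            with tokenize.open(filename) as fp:
                fp.read()
        except (UnicodeDecodeError, SyntaxError) as exc:
            # SyntaxError covers a bad or missing encoding declaration.
            problems.append((filename, exc))
        except OSError:
            pass  # e.g. "<stdin>" or a file that no longer exists
    return problems


try:
    1 / 0  # stand-in for the code that raises in your script
except Exception:
    for filename, exc in undecodable_sources(sys.exc_info()[2]):
        print("Cannot decode {}: {}".format(filename, exc))
```

Calling this in the except block instead of traceback.print_exc() should name the file that trips up the decoder, without raising a second exception while reporting the first.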

solution1
1 ACCEPTED 2014-07-15 07:10:13
