UTF-8 and colour printing problem

Question

I have a console program that outputs in wonderful colour. For errors, the following code is used, with some trivial examples at the bottom.

# coding: utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
from sys import stderr
from colored import fg
from colored import attr
from locale import getpreferredencoding

def format_error(x):
    return '{0}{1}{2}'.format(fg(88), x, attr('reset'))

def print_error(x):
    msg = format_error('✗  {0}\n'.format(x))
    stderr.write(msg.encode(getpreferredencoding()))

print_error(str('ook'))
print_error(unicode(b'café', 'UTF-8'))

I have no control over what x is. It could be anything. Also, some of this script is called from a GUI that captures stdout / stderr via glib-spawn-async. As such, from time to time, I get UnicodeDecodeError errors. I have read the Unicode HOWTO but clearly I am missing something.

How can I harden my code so that UnicodeDecodeError is never raised?

For example, within a gtk.textview, I get the following, whereas on the console all is fine. The trace has been cut to remove irrelevant data.

 File "/home/usr/nifty_logger.py", line 96, in print_success
    sys.stdout.write(msg.encode(getpreferredencoding()))
  File "/home/usr/.virtualenvs/rprs_bootstrap/lib64/python2.7/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

Answer 1

The encode() method takes an optional argument that defines the error handling:

str.encode([encoding[, errors]])

From the docs:

Return an encoded version of the string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error(), see section Codec Base Classes. For a list of possible encodings, see section Standard Encodings.
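As a rough illustration of how those handlers differ, the sketch below encodes one string to ASCII with each of them. The sample text and the names `sample`, `handlers` and `results` are made up for this example; the handler names are the ones listed in the quoted docs.

```python
from __future__ import print_function

# Hypothetical sample text ('café ✗'), written with escapes so the source stays ASCII.
sample = u'caf\xe9 \u2717'

handlers = ['ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace']
results = {}
for name in handlers:
    # Each handler decides what happens to characters ASCII cannot represent.
    results[name] = sample.encode('ascii', name)
    print(name, results[name])

# With the default 'strict' handler the same call raises instead.
try:
    sample.encode('ascii')
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True
```

'ignore' silently drops the characters, 'replace' substitutes a question mark, and the last two keep the information in an escaped form, which is usually the more useful choice for error messages.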

In your case:

msg.encode(getpreferredencoding(), 'backslashreplace')
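Since x can be anything, it may also help to normalise it to text before encoding. In Python 2, calling encode() on a byte string first decodes it implicitly as ASCII, which is one common source of UnicodeDecodeError. The following is only a sketch under that assumption: `to_text` and `encode_for_stream` are hypothetical helper names, the UTF-8 default for incoming bytes is a guess about the input, and passing non-string objects through repr() is one possible choice among several.

```python
from locale import getpreferredencoding


def to_text(x, encoding='utf-8'):
    """Return x as text without raising on undecodable bytes (hypothetical helper)."""
    if isinstance(x, bytes):
        # Assumption: incoming bytes are UTF-8; bad bytes become U+FFFD.
        return x.decode(encoding, 'replace')
    if isinstance(x, type(u'')):
        return x
    # Anything else: fall back to repr(), then normalise that result too.
    return to_text(repr(x), encoding)


def encode_for_stream(x, encoding=None):
    """Encode x for output, escaping characters the target encoding lacks."""
    encoding = encoding or getpreferredencoding()
    return to_text(x).encode(encoding, 'backslashreplace')


print(encode_for_stream(b'caf\xc3\xa9', 'ascii'))
print(encode_for_stream(u'\u2717 ook', 'ascii'))
```

The decode step uses 'replace' and the encode step uses 'backslashreplace', so neither direction is left on the default 'strict' handler.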
