Is it possible to raise an exception that includes non-English characters in Python 2?

I'm trying to raise an exception in Python 2.7.x whose message includes a unicode string, but I can't seem to make it work.

Is including unicode in an error message unsupported or not recommended? Or do I need to be looking at sys.stderr?

# -*- coding: utf-8 -*-
class MyException(Exception):
    def __init__(self, value):
        self.value = value
    def __str__(self):
        return self.value
    def __repr__(self):
        return self.value
    def __unicode__(self):
        return self.value

desc = u'something bad with field \u4443'

try:
    raise MyException(desc)
except MyException as e:
    print(u'Inside try block : ' + unicode(e))

# here is what i wish to make work
raise MyException(desc)

Running the script produces the output below. Inside my try/except I can print the string without a problem.

My problem is outside the try/except.

Inside try block : something bad with field 䑃
Traceback (most recent call last):
  File "C:\Python27\lib\bdb.py", line 387, in run
    exec cmd in globals, locals
  File "C:\Users\ghis3080\r.py", line 25, in <module>
    raise MyException(desc)
MyException: something bad with field \u4443

Thanks in advance.

This is how Python works. I believe what you are seeing comes from traceback._some_str() in the Python standard library. When that module formats a stack trace, the function first tries to convert the message using str(); if that raises an exception, it converts the message using unicode() and then encodes the result to ASCII using encode("ascii", "backslashreplace"). You are getting valid output and everything is working correctly. My guess is that Python is doing its best to down-convert the error message so that it displays without problems no matter which platform executes it. The \u4443 in the traceback is just the escaped Unicode code point for your character. It doesn't happen in your try/except block because this conversion is specific to the mechanism that produces stack traces (such as for uncaught exceptions).
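The escaping step described above can be reproduced on its own. This is only a minimal sketch of the encode("ascii", "backslashreplace") call, not the traceback module's actual code; ascii_escape is a hypothetical helper name introduced for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the escaping step, not the traceback module itself.

def ascii_escape(message):
    """Encode text to ASCII, turning unencodable characters into \\uXXXX escapes.

    `ascii_escape` is a hypothetical helper, not part of the standard library.
    """
    # encode() yields ASCII-only bytes; decode() turns them back into text.
    return message.encode("ascii", "backslashreplace").decode("ascii")


desc = u'something bad with field \u4443'
escaped = ascii_escape(desc)
print(escaped)  # the character appears as a literal backslash escape
```

The printed text should match the last line of the traceback in the question, which suggests the message itself was not lost, only escaped for display.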

The behaviour depends on the Python version and the environment. On Python 3 the character encoding error handler for sys.stderr is always 'backslashreplace':

from __future__ import unicode_literals, print_function
import sys

s = 'unicode "\u2323" smile'
print(s)
print(s, file=sys.stderr)
try:
    raise RuntimeError(s)
except Exception as e:
    print(e.args[0])
    print(e.args[0], file=sys.stderr)
    raise

python3:

$ PYTHONIOENCODING=ascii:ignore python3 raise_unicode.py
unicode "" smile
unicode "\u2323" smile
unicode "" smile
unicode "\u2323" smile
Traceback (most recent call last):
  File "raise_unicode.py", line 8, in <module>
    raise RuntimeError(s)
RuntimeError: unicode "\u2323" smile

python2:

$ PYTHONIOENCODING=ascii:ignore python2 raise_unicode.py
unicode "" smile
unicode "" smile
unicode "" smile
unicode "" smile
Traceback (most recent call last):
  File "raise_unicode.py", line 8, in <module>
    raise RuntimeError(s)
RuntimeError

That is, on my system the error message is swallowed on python2.

Note: on Windows you could try:

T:\> set PYTHONIOENCODING=ascii:ignore
T:\> python raise_unicode.py

For comparison:

$ python3 raise_unicode.py
unicode "⌣" smile
unicode "⌣" smile
unicode "⌣" smile
unicode "⌣" smile
Traceback (most recent call last):
  File "raise_unicode.py", line 8, in <module>
    raise RuntimeError(s)
RuntimeError: unicode "⌣" smile
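The difference between the 'ignore' and 'backslashreplace' error handlers in the runs above can be seen without touching the real standard streams. The following is a rough sketch that wraps an in-memory buffer in an ASCII text stream; render is a hypothetical helper, and the in-memory stream only approximates how sys.stdout and sys.stderr are configured.

```python
# -*- coding: utf-8 -*-
import io


def render(text, errors):
    """Write text through an ASCII stream using the given error handler.

    Hypothetical helper: an in-memory stand-in for stdout/stderr.
    """
    buf = io.BytesIO()
    stream = io.TextIOWrapper(buf, encoding="ascii", errors=errors)
    stream.write(text)
    stream.flush()
    stream.detach()  # release the buffer so it stays readable
    return buf.getvalue().decode("ascii")


s = u'unicode "\u2323" smile'
ignored = render(s, "ignore")            # character is silently dropped
escaped = render(s, "backslashreplace")  # character becomes a \uXXXX escape
print(ignored)
print(escaped)
```

With 'ignore' the character disappears, as in the stdout lines under PYTHONIOENCODING=ascii:ignore, while 'backslashreplace' keeps it visible as an escape, as in the Python 3 stderr lines.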

In my case your example worked as it should, printing nice unicode.

But sometimes the exception stack trace is printed without the unicode characters, or with them escaped as backslash sequences. It is possible to work around this and print the message as intended.

Example of the problem, with output (Python 2.7, Linux):

# -*- coding: utf-8 -*-
desc = u'something bad with field ¾'
raise SyntaxError(desc)

It prints only a truncated or mangled message:

~/.../sources/C_patch$ python SO.py 
Traceback (most recent call last):
  File "SO.py", line 25, in <module>
    raise SyntaxError(desc)
SyntaxError

To actually see the unaltered unicode, you can encode it to raw bytes and pass those to the exception object:

# -*- coding: utf-8 -*-
desc = u'something bad with field ¾'
raise SyntaxError(desc.encode('utf-8', 'replace'))

This time you will see the full message:

~/.../sources/C_patch$ python SO.py 
Traceback (most recent call last):
  File "SO.py", line 3, in <module>
    raise SyntaxError(desc.encode('utf-8', 'replace'))
SyntaxError: something bad with field ¾

You can call value.encode('utf-8', 'replace') in your constructor if you like, but with a built-in exception you will have to do it in the raise statement, as in the example.
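The constructor variant might look roughly like the sketch below. FieldError and the PY2 flag are hypothetical names introduced for illustration, and the version check is an assumption added so the same file also runs on Python 3, where the encoding step is unnecessary.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2  # hypothetical flag, for illustration only


class FieldError(Exception):
    """Hypothetical exception that stores a message str() can always return."""

    def __init__(self, value):
        # On Python 2, encode unicode text to UTF-8 bytes so that str(e)
        # does not attempt an implicit ASCII encode. The name `unicode`
        # is only evaluated when PY2 is true.
        if PY2 and isinstance(value, unicode):  # noqa: F821
            value = value.encode('utf-8', 'replace')
        super(FieldError, self).__init__(value)


try:
    raise FieldError(u'something bad with field \u00be')
except FieldError as e:
    message = str(e)  # UTF-8 bytes on Python 2, text on Python 3
```

Encoding once in the constructor keeps every raise site free of .encode() calls. The cost is that on Python 2 the stored message is bytes, so callers that want text have to decode it again.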

The hint is taken from here: Overcoming frustration: Correctly using unicode in python2 (it describes a big library with many helpers, all of which boil down to the example above).
