
Raising error with non-BMP characters restarts shell

I'm writing a Python module for displaying and entering emoji in pygame. This means I often work with non-BMP Unicode characters, which the Python shell apparently doesn't like.
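For readers unfamiliar with the term, "non-BMP" means any code point above U+FFFF, which covers most emoji. A minimal sketch of a check for such characters follows; the helper name `has_non_bmp` is made up for illustration and is not part of the module described here.

```python
def has_non_bmp(text):
    # Python 3 strings are sequences of code points, so ord() returns
    # the full code point even for characters outside the BMP.
    return any(ord(ch) > 0xFFFF for ch in text)


print(has_non_bmp('Beware the non-BMP!'))             # False
print(has_non_bmp('Beware the non-BMP! \U0001f603'))  # True
```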

I've made a custom string-like object that makes emoji characters and sequences easier to handle by storing each emoji sequence as a single character. However, although I'd like str(self) to return the object's raw Unicode representation, this causes problems when the object is printed or, even worse, when it's included in an error message.

This is an example of what happens when a non-BMP character is included in the error message, running Python 3.7.3 on Windows 10:

>>> raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 474, in runcode
    exec(code, self.locals)
  File "<pyshell#0>", line 1, in <module>
Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 474, in runcode
    exec(code, self.locals)
  File "<pyshell#0>", line 1, in <module>
ValueError: 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 144, in main
    ret = method(*args, **kwargs)
  File "D:\Python37\lib\idlelib\run.py", line 486, in runcode
    print_exception()
  File "D:\Python37\lib\idlelib\run.py", line 234, in print_exception
    print_exc(typ, val, tb)
  File "D:\Python37\lib\idlelib\run.py", line 232, in print_exc
    print(line, end='', file=efile)
  File "D:\Python37\lib\idlelib\run.py", line 362, in write
    return self.shell.write(s, self.tags)
  File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
    return self.asyncreturn(seq)
  File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
    return self.decoderesponse(response)
  File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
    raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 158, in main
    print_exception()
  File "D:\Python37\lib\idlelib\run.py", line 234, in print_exception
    print_exc(typ, val, tb)
  File "D:\Python37\lib\idlelib\run.py", line 220, in print_exc
    print_exc(type(context), context, context.__traceback__)
  File "D:\Python37\lib\idlelib\run.py", line 232, in print_exc
    print(line, end='', file=efile)
  File "D:\Python37\lib\idlelib\run.py", line 362, in write
    return self.shell.write(s, self.tags)
  File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
    return self.asyncreturn(seq)
  File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
    return self.decoderesponse(response)
  File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
    raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Python37\lib\idlelib\run.py", line 162, in main
    traceback.print_exception(type, value, tb, file=sys.__stderr__)
  File "D:\Python37\lib\traceback.py", line 105, in print_exception
    print(line, file=file, end="")
  File "D:\Python37\lib\idlelib\run.py", line 362, in write
    return self.shell.write(s, self.tags)
  File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
    return self.asyncreturn(seq)
  File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
    return self.decoderesponse(response)
  File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
    raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk

=============================== RESTART: Shell ===============================

As you can see, the shell appears to get into an infinite loop trying to deal with the error, then restarts to avoid getting stuck. Is there any way I could a) make str work differently for the error handler or b) prevent the shell restart so the error displays properly?

Taking ideas from snakecharmerb and these two questions, I've implemented some code that checks whether the module is being run in IDLE and, if so, whether the function is being called by the error handler. Tests appear to be working fine. The following checks for an IDLE environment:

import sys

# True if any of IDLE's modules has already been imported
IN_IDLE = any(name in sys.modules
              for name in ('idlelib.__main__', 'idlelib.run', 'idlelib'))

And below is the new __str__ function:

    def __str__(self):
        """ Return str(self). """
        if IN_IDLE:
            # Check for caller. If string is being printed, modify
            # output to be IDLE-friendly (no non-BMP characters)
            callername = sys._getframe(1).f_code.co_name
            if callername == '_some_str':
                rstr = ''
                for char in self.__raw:
                    if ord(char) > 0xFFFF:
                        rstr += '\\U'+hex(ord(char))[2:].zfill(8)
                    else:
                        rstr += repr(char)[1:-1]
                return rstr
            else:
                return self.__raw
        else:
            return self.__raw

Here self.__raw holds the raw text representation of the object. I'm caching it to improve efficiency, since the objects are intended to be immutable.

Of course, while this does work around the issue, I feel that Python shouldn't restart the entire shell when this occurs. I will post this on bugs.python.org.

EDIT: Posted on bugs.python.org as issue 36698.
