Why does wx.TextCtrl.SetStyle mishandle emojis?

I'm developing an application with wxPython 4.0.4 on Python 3.7.3, and I've run into a problem when trying to color UTF-8 text in a wx.TextCtrl. Certain characters seem to be counted incorrectly inside wxPython even though Python itself counts them correctly.

I initially thought that all multi-byte characters were being miscounted; however, my example code below shows that this is not the case. The problem appears to be specific to the wx.TextCtrl.SetStyle function.

import wx
import wx.richtext as rt
app = wx.App()

test_str1 = '''There are no multibyte characters '''
test_str2 = '''blah ble blah\n'''
test_str3 = '''“these are multibyte quotes” '''
test_str4 = '''more single byte chars!\n'''
test_str5 = '''this comma’s represented by multiple bytes\n'''
test_str6 = '''why do emojis 💨 💨 💨 seem to break TextCtrl.SetStyle 💨 💨 💨\n'''
test_str7 = '''more single byte characters\n'''
test_str8 = '''to demonstrate the issue.'''

def main():
    main = TestFrame()
    main.Show()
    app.MainLoop()

class TestFrame(wx.Frame):
    def __init__(self):
        wx.Frame.__init__(self, None, title="TestFrame")
        sizer = wx.BoxSizer(wx.VERTICAL)
        self.panel = TestPanel(self)
        sizer.Add(self.panel, proportion=1, flag=wx.EXPAND)
        self.SetSizer(sizer)  # attach the sizer so it actually manages the panel

class TestPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        self.text = wx.TextCtrl(self, wx.ID_ANY, style=(wx.TE_MULTILINE|wx.TE_RICH|wx.TE_READONLY))
        self.raw_text = ""
        self.styles = []
        self.AddColorText(test_str1, wx.BLUE)
        self.AddColorText(test_str2, wx.RED)
        self.AddColorText(test_str3, wx.BLUE)
        self.AddColorText(test_str4, wx.RED)
        self.AddColorText(test_str5, wx.BLUE)
        self.AddColorText(test_str6, wx.RED)
        self.AddColorText(test_str7, wx.BLUE)
        self.AddColorText(test_str8, wx.RED)
        self.text.SetValue(self.raw_text)
        for s in self.styles:
            self.text.SetStyle(s[0], s[1], s[2])
        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self.text, proportion=1, flag=wx.EXPAND)
        self.SetSizer(sizer)

    def AddColorText(self, text, wx_color):
        start = len(self.raw_text)
        self.raw_text += text
        end = len(self.raw_text)
        self.styles.append([start, end, wx.TextAttr(wx_color)])

if __name__ == "__main__":
    main()

MS Windows uses UTF-16 internally, and before PEP 393, CPython Unicode strings were also 16-bit on Windows because of that. With PEP 393 (Python 3.3 and later), CPython can represent all Unicode code points more cleanly, so that one Unicode code point always has a string length of 1.

Windows, on the other hand, cannot. So wxPython must translate strings into UTF-16 before sending them to the operating system. For everything in the Basic Multilingual Plane (BMP) of Unicode, which is most of what you'll encounter, that works out fine, because one Unicode code point becomes one UTF-16 code unit (two bytes).
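As a quick way to tell which of the question's strings stay inside the BMP, each code point can be compared against U+FFFF. This is only a sketch; the helper name `is_bmp_only` is made up for illustration and is not part of wxPython.

```python
def is_bmp_only(text):
    """Return True if every code point fits in a single UTF-16 code unit."""
    return all(ord(ch) <= 0xFFFF for ch in text)

# The curly quotes are U+201C and U+201D, both inside the BMP.
print(is_bmp_only("“these are multibyte quotes” "))
# The emoji is U+1F4A8, which lies outside the BMP.
print(is_bmp_only("why do emojis 💨 💨 💨"))
```

This matches what the question observed: the curly quotes and the typographic apostrophe are multi-byte in UTF-8 but still colored correctly, and only the emoji line goes wrong.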

But those new emojis are not in the BMP, so they take more than two bytes in UTF-16: each one is encoded as a surrogate pair of two code units. wxPython fails to account for that. If wxPython passes the start and end counters straight through to an underlying Windows function, they will be off after an emoji, because the values you supply count Unicode code points while the values Windows expects count UTF-16 code units.
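The mismatch can be seen in plain Python, without wx. The sketch below counts the same text both ways; the `utf-16-le` codec is used because it emits no byte-order mark, so dividing the byte length by two gives the number of code units.

```python
emoji = "💨"
code_points = len(emoji)                          # how Python counts
code_units = len(emoji.encode("utf-16-le")) // 2  # how UTF-16 counts

line = "why do emojis 💨 💨 💨 seem"
drift = len(line.encode("utf-16-le")) // 2 - len(line)

print(code_points, code_units)
print(drift)  # one extra code unit per emoji in the line
```

Every offset computed with `len()` after the emojis is short by that drift, which is consistent with the coloring sliding out of place from the emoji line onward.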

You can work around it by computing the UTF-16 offsets yourself and passing those to SetStyle:

# 'utf-16-le' adds no byte-order mark; two bytes per code unit, hence // 2
utf16start = len(self.raw_text[:start].encode('utf-16-le')) // 2
utf16end = utf16start + len(self.raw_text[start:end].encode('utf-16-le')) // 2
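The same conversion can be wrapped in a small standalone function. This is a sketch under assumptions: `to_utf16_span` is a hypothetical name, it counts code units rather than bytes, and it assumes the control expects UTF-16 offsets as described above for Windows. On other platforms the native control may count differently, so the conversion may not be wanted there.

```python
def to_utf16_span(text, start, end):
    """Convert code-point offsets [start, end) into UTF-16 code-unit offsets."""
    def units(s):
        # 'utf-16-le' emits no byte-order mark; each code unit is two bytes.
        return len(s.encode("utf-16-le")) // 2

    utf16_start = units(text[:start])
    return utf16_start, utf16_start + units(text[start:end])

text = "a 💨 b"
# Python sees "b" at index 4; in UTF-16 the emoji before it takes two units.
print(to_utf16_span(text, 4, 5))
```

In the question's code, the result would replace `s[0]` and `s[1]` in the `SetStyle` call.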

Arguably this is a bug in wxPython, and you should report it to the wxPython issue tracker.

It appears that my issue was using Python's standard functions to calculate lengths instead of the wx library functions. The code below resolves my problem.

import wx
import wx.richtext as rt
app = wx.App()

test_str1 = '''There are no multibyte characters '''
test_str2 = '''blah ble blah\n'''
test_str3 = '''“these are multibyte quotes” '''
test_str4 = '''more single byte chars!\n'''
test_str5 = '''this comma’s represented by multiple bytes\n'''
test_str6 = '''why do emojis 💨 💨 💨 seem to break TextCtrl.SetStyle 💨 💨 💨\n'''
test_str7 = '''more single byte characters\n'''
test_str8 = '''to demonstrate the issue.'''

def main():
    main = TestFrame()
    main.Show()
    app.MainLoop()

class TestFrame(wx.Frame):
    def __init__(self):
        wx.Frame.__init__(self, None, title="TestFrame")
        sizer = wx.BoxSizer(wx.VERTICAL)
        self.panel = TestPanel(self)
        sizer.Add(self.panel, proportion=1, flag=wx.EXPAND)
        self.SetSizer(sizer)  # attach the sizer so it actually manages the panel

class TestPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        self.text = wx.TextCtrl(self, wx.ID_ANY, style=(wx.TE_MULTILINE|wx.TE_RICH|wx.TE_READONLY))
        self.raw_text = ""
        self.styles = []
        self.AddColorText(test_str1, wx.BLUE)
        self.AddColorText(test_str2, wx.RED)
        self.AddColorText(test_str3, wx.BLUE)
        self.AddColorText(test_str4, wx.RED)
        self.AddColorText(test_str5, wx.BLUE)
        self.AddColorText(test_str6, wx.RED)
        self.AddColorText(test_str7, wx.BLUE)
        self.AddColorText(test_str8, wx.RED)
        for s in self.styles:
            self.text.SetStyle(s[0], s[1], s[2])
        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self.text, proportion=1, flag=wx.EXPAND)
        self.SetSizer(sizer)

    def AddColorText(self, text, wx_color):
        start = self.text.GetLastPosition()
        self.text.AppendText(text)
        end = self.text.GetLastPosition()
        self.styles.append([start, end, wx.TextAttr(wx_color)])

if __name__ == "__main__":
    main()
