How can I use text with html tags in a tkinter text box, or change it so that it works in a tkinter label?

I've been given a lot of text and asked to display it in a tkinter app. The text has a lot of html tags like <em> ... </em> and <sup> ... </sup> where the text needs to be italicized or superscript.

Is there any way built into tkinter to do this? If not, is it even possible to write a function to, for example, italicize all text between <em> tags, then delete the tags?

I know I would be able to remove the tags by doing something like:

for tag in ["<em>", "</em>", "<sup>", "</sup>"]:
    text = text.replace(tag, "")

But I really need to, at least, italicize the text between <em> tags before removing them.

I'm new to tkinter, and I've been watching a lot of tutorials and googling for solutions, but it seems like tkinter can't natively use html tags, and I can't find any solution.

EDIT:

I need to display this in a regular tkinter text widget.

I know I can use tkinter's font module with slant="italic" to set text in a text box to italic. I just need to know a way to apply that setting to everything between <em> tags.

So, I worked this out myself over the last few days. First you have to find the places in the text that you want to italicize, removing the html tags from the text as you go along. Next you have to put the tag-free text into a text widget. Then you have to identify the points in the widget's text to italicize.

It's a bit finicky because an index in a text widget has the form line.column: the number before the point is the line number (starting at 1), and the number after it is the index of the character within that line (starting at 0). This means you need to work out a line number for each index, so you need a way of knowing exactly where one line ends and another begins. Also, line 2, character 4 is 2.4, and line 2, character 40 is 2.40, so float(f"{line_number}.{character_number}") won't work because it drops any trailing zeros. I used Decimal(f"{line_number}.{character_number}"), which keeps them; passing the plain string f"{line_number}.{character_number}" also works, since tkinter converts the index to a string anyway.
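To make the line/column bookkeeping concrete, here is a minimal sketch of a helper that turns a character offset in a Python string into a line.column index string, and shows why float loses the trailing zero. The name offset_to_index and the sample text are my own inventions for illustration, not part of tkinter.

```python
from decimal import Decimal


def offset_to_index(text, offset):
    """Convert a 0-based character offset into a 'line.column' index string.

    Lines are numbered from 1 and columns from 0, as in a tkinter Text widget.
    (Hypothetical helper, not a tkinter function.)
    """
    before = text[:offset]
    line = before.count("\n") + 1
    # rfind returns -1 when there is no newline, so +1 gives a line start of 0.
    column = offset - (before.rfind("\n") + 1)
    return f"{line}.{column}"


sample = "first line\n" + "x" * 50
idx = offset_to_index(sample, 11 + 40)  # character 40 on line 2
print(idx)           # 2.40
print(float(idx))    # 2.4  (trailing zero lost, now points at character 4)
print(Decimal(idx))  # 2.40
```

Returning a string sidesteps the trailing-zero problem entirely, because nothing ever interprets the index as a number.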

For example, in the text alphabet = 'abcd efg hijk\nlmnop qrs tuv wx yz' , if you want to italicize all of the letters from "h" to "p" you first have to get an index for "h" to start italicizing at, start = alphabet.find("h") , then an index just after "p" to stop italicizing at, end = alphabet.find("p") + 1 . Next you have to find which line the start point and end point are on and translate the indices (9 and 19 respectively) to line.column format (1.9 and 2.5):

start_line = alphabet[:start].count("\n") + 1
end_line = alphabet[:end].count("\n") + 1
line_start_point = len(alphabet[alphabet[:start].rfind("\n") + 1: start])
line_end_point = len(alphabet[alphabet[:end].rfind("\n") + 1: end])
start_point = Decimal(f"{start_line}.{line_start_point}")
end_point = Decimal(f"{end_line}.{line_end_point}")
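As a possible shortcut, Tk text indices also accept a "+ N chars" modifier, so a flat offset can be written relative to "1.0" and the widget does the line counting itself. The following is only a sketch under the assumption that the widget contains exactly the same text as the Python string; tag_by_offsets is a hypothetical helper name, not a tkinter function.

```python
def tag_by_offsets(text_box, tag_name, start, end):
    """Apply tag_name to the characters in [start, end).

    Offsets are counted from the beginning of the widget's content, and a
    newline counts as one character. text_box is assumed to be a tkinter
    Text widget (anything with a compatible tag_add method works).
    """
    text_box.tag_add(tag_name, f"1.0 + {start} chars", f"1.0 + {end} chars")


alphabet = 'abcd efg hijk\nlmnop qrs tuv wx yz'
start = alphabet.find("h")
end = alphabet.find("p") + 1
# With a real Text widget holding `alphabet`, and an "italics" tag already
# configured on it, the call would be:
#     tag_by_offsets(my_text_widget, "italics", start, end)
```

The tag itself still has to be configured with an italic font before it has any visible effect.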

Anyway, here's all of the code I ended up using to remove the unnecessary <sup>...</sup> tags and anything between them, and to italicize everything between <em>...</em> tags:

import re
from decimal import Decimal
from tkinter import *
from tkinter import font

def em_points(text):
    # Drop every <sup>...</sup> tag together with its contents.
    text = re.sub(r'<sup>\w*</sup>', '', text)
    finds = list()
    if "<em>" in text:
        find_points = list()
        emcount = text.count("<em>")
        for _ in range(emcount):
            find_open = text.find("<em>")
            text = text[:find_open] + text[find_open + 4:]
            find_close = text.find("</em>")
            text = text[:find_close] + text[find_close + 5:]
            find_points.append([find_open, find_close])
        for points in find_points:
            finds.append(text[points[0]: points[1]])
    return [text, finds]

def italicize_text(text_box, finds):
    italics_font = font.Font(text_box, text_box.cget("font"))
    italics_font.configure(slant="italic")
    text_box.tag_configure("italics", font=italics_font)
    text_in_box = text_box.get(1.0, END)
    used_points = list()
    for find in finds:
        if find not in text_in_box:
            raise RuntimeError(f"Could not find text to italicise in textbox:\n    {find}\n    {text_in_box}")
        else:
            start_point = text_in_box.find(find)
            end_point = start_point + len(find)
            found_at = [start_point, end_point]
            if found_at in used_points:
                while found_at in used_points:
                    reduced_text = text_in_box[end_point:]
                    start_point = end_point + reduced_text.find(find)
                    end_point = start_point + len(find)
                    found_at = [start_point, end_point]
            used_points.append(found_at)
            text_to_startpoint = text_in_box[:start_point]
            text_to_endpoint = text_in_box[:end_point]
            start_line = text_to_startpoint.count("\n") + 1
            end_line = text_to_endpoint.count("\n") + 1
            if "\n" in text_to_startpoint:
                line_start_point = len(text_in_box[text_to_startpoint.rfind("\n") + 1: start_point])
            else:
                line_start_point = start_point
            if "\n" in text_to_endpoint:
                line_end_point = len(text_in_box[text_to_endpoint.rfind("\n") + 1: end_point])
            else:
                line_end_point = end_point
            start_point = Decimal(f"{start_line}.{line_start_point}")
            end_point = Decimal(f"{end_line}.{line_end_point}")
            text_box.tag_add("italics", start_point, end_point)

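# Example usage. The root window and the sample text are placeholders;
# substitute your own.
root = Tk()
text = "Plain text, <em>italic text</em>, more plain text<sup>1</sup>."

clean_text, em_list = em_points(text)

text_box = Text(root, width=80, height=5, font=("Courier", 12))
text_box.insert("1.0", clean_text)
text_box.pack()
italicize_text(text_box, em_list)

root.mainloop()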
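For comparison, here is a rough sketch of a different way to do the first step, using the standard library's html.parser instead of string searching. It records the character offsets of each <em> run directly, so repeated phrases don't need to be searched for afterwards. This is not the code I used; the class and function names (EmCollector, parse_em) are made up for illustration, and it assumes the <em> tags are not nested.

```python
from html.parser import HTMLParser


class EmCollector(HTMLParser):
    """Collect tag-free text plus (start, end) offsets of <em> runs.

    Text inside <sup>...</sup> is discarded.
    """

    def __init__(self):
        super().__init__()
        self.parts = []        # pieces of tag-free text
        self.length = 0        # running length of the tag-free text
        self.spans = []        # (start, end) offsets of italic runs
        self._em_start = None  # offset where the open <em> began, if any
        self._in_sup = False

    def handle_starttag(self, tag, attrs):
        if tag == "em":
            self._em_start = self.length
        elif tag == "sup":
            self._in_sup = True

    def handle_endtag(self, tag):
        if tag == "em" and self._em_start is not None:
            self.spans.append((self._em_start, self.length))
            self._em_start = None
        elif tag == "sup":
            self._in_sup = False

    def handle_data(self, data):
        if not self._in_sup:
            self.parts.append(data)
            self.length += len(data)


def parse_em(html_text):
    """Return (clean_text, spans) for a string containing <em>/<sup> tags."""
    parser = EmCollector()
    parser.feed(html_text)
    parser.close()  # flush any trailing text
    return "".join(parser.parts), parser.spans


clean, spans = parse_em("one <em>two</em> three<sup>4</sup> <em>two</em>")
print(clean)  # one two three two
print(spans)  # [(4, 7), (14, 17)]
```

Each (start, end) pair can then be passed to the text widget's tag_add, for example as f"1.0 + {start} chars" and f"1.0 + {end} chars", provided the widget holds exactly the cleaned text.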
