
PyMuPDF: how do I remove annotations?

I am using PyMuPDF and trying to loop through a list of strings, highlighting each one before taking an image and moving on to the next string.

The code below does what I need, but the annotation remains after each loop iteration and I would like to remove it once the image is taken.
The example image below shows the word "command" highlighted, but the previous strings "Images" and "filename" are still highlighted as well. Since I will have hundreds of these images compiled into a report, I would like only the current string to stand out in each image.

Is there something like page.remove(highlight)?

[Image: PyMuPDF output example]

import fitz  # PyMuPDF

# doc (an open fitz.Document) and text_list (the strings to search for)
# are assumed to be defined earlier.
inst_counter = 0
zoom_mat = fitz.Matrix(4, 4)

for pi in range(doc.pageCount):
    page = doc[pi]

    # Padding around each hit: 5% of the page height and width.
    five_percent_height = (page.rect.br.y - page.rect.tl.y) * 0.05
    five_percent_width = (page.rect.br.x - page.rect.tl.x) * 0.05

    for text in text_list:
        text_instances = page.searchFor(text)

        for inst in text_instances:
            inst_counter += 1
            highlight = page.addSquigglyAnnot(inst)

            # Clip rectangle: the hit plus padding, clamped to the page.
            tl_pt = fitz.Point(max(page.rect.tl.x, inst.tl.x - five_percent_width),
                               max(page.rect.tl.y, inst.tl.y - five_percent_height))
            br_pt = fitz.Point(min(page.rect.br.x, inst.br.x + five_percent_width),
                               min(page.rect.br.y, inst.br.y + five_percent_height))
            hl_clip = fitz.Rect(tl_pt, br_pt)

            pix = page.getPixmap(matrix=zoom_mat, clip=hl_clip)
            # I want to remove the annotation here

I found that an acceptable solution was to just set the annotation's opacity to 0% after taking the screenshot.

pix = page.getPixmap(matrix=zoom_mat, clip = hl_clip)
highlight.setOpacity(0)
highlight.update()

Do this:

annot = page.firstAnnot
while annot:
    annot = page.deleteAnnot(annot)

The method returns the annotation following the deleted one, so the loop ends once the last annotation has been removed.

Jorj's approach is good. However, the documentation describes other options:

https://pymupdf.readthedocs.io/en/latest/faq.html#how-to-read-and-update-pdf-objects

This method can also be used to remove a key from the xref dictionary by setting its value to null: The following will remove the rotation specification from the page: doc.xref_set_key(page.xref, "Rotate", "null"). Similarly, to remove all links, annotations and fields from a page, use doc.xref_set_key(page.xref, "Annots", "null"). Because Annots by definition is an array, setting an empty array with the statement doc.xref_set_key(page.xref, "Annots", "[]") would do the same job in this case.


 