
PDFClown: Copy annotations and then manipulate them

I need to copy annotations from one PDF file to another. I have used the excellent PDFClown library but am unable to manipulate things like color, rotation, etc. Is this possible? I can see the base object information but am also unsure how to manipulate that directly.

I can copy the appearance by cloning it but can't "edit" it.

Thanks in advance. Alex

PS: If Stephano, the author, is listening: is the project dead?

On annotations in general and Callout annotations in particular

I looked into it a bit, and I'm afraid there is not much you can deterministically manipulate for arbitrary inputs using high level methods. The reason is that there are numerous alternative ways to set the appearance of a Callout annotation, and PDF Clown only supports the lower-priority ones with explicit high level methods. From high priority downwards:

  • An explicit appearance in an AP stream. If it is given, it is used, regardless of whether this appearance looks like a Callout annotation at all, let alone like one defined by the other Callout properties.

    PDF Clown does not create an appearance for Callout annotations from the other values yet, let alone update existing appearances to reflect a change of some specific attribute (e.g. Color). For ISO 32000-2 support, PDF Clown will have to improve here, as appearance streams have become mandatory.

    If it exists, you can retrieve the appearance using getAppearance(), but you only get a FormXObject with its low level drawing instructions, nothing Callout specific.

    There is one thing you can manipulate quite easily given a FormXObject, though: you can rotate or skew the appearance by setting its Matrix accordingly, e.g.

     annotation.getAppearance().getNormal().get(null).setMatrix(AffineTransform.getRotateInstance(100, 10)); 
  • A rich text string in the RC string or stream. Unless an appearance is given, the text in the Callout text box is generated from this rich text datum (rich text here uses an XHTML 1.0 subset for formatting).

    PDF Clown does not create a rich text representation of the Callout text yet, let alone update existing ones to reflect a change of some specific attribute (e.g. Color).

    If it exists, you can retrieve the rich text by low level access using getBaseDataObject().get(PdfName.RC), change this string or stream, and set it again using getBaseDataObject().put(PdfName.RC, ...). Similarly, you can retrieve, manipulate, and set the rich text default style string using its name PdfName.DS instead.

  • A number of different settings for separate aspects, from which the Callout is built in the absence of an appearance stream and (as far as the text content is concerned) a rich text string.

    PDF Clown supports (many of) these attributes, in particular if you cast the cloned annotation to StaticNote, e.g. the opacity CA using get/set/withAlpha, the border Border / BS using get/set/withBorder, the background color C using get/set/withColor, ...

    By the way, it has an error in its line ending style LE support: apparently the code for the Line annotation LE property was copied without checking; unfortunately, that attribute follows a different syntax there...
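To make the Matrix manipulation from the first item more tangible, here is a minimal stand-alone sketch (plain JDK, no PDF Clown involved) of what AffineTransform.getRotateInstance(100, 10) encodes. The class name AppearanceMatrixSketch and the helper toPdfMatrix are made up for illustration; the only API relied upon is java.awt.geom.AffineTransform.

```java
import java.awt.geom.AffineTransform;
import java.util.Arrays;

final class AppearanceMatrixSketch {
    /** Returns the transform as the six numbers [a b c d e f] of a PDF Matrix entry. */
    static double[] toPdfMatrix(AffineTransform transform) {
        double[] matrix = new double[6];
        // getMatrix fills m00, m10, m01, m11, m02, m12,
        // which is the same order as the PDF matrix operands a, b, c, d, e, f
        transform.getMatrix(matrix);
        return matrix;
    }

    public static void main(String[] args) {
        // Same arguments as in the setMatrix call above:
        // rotate by the angle of the direction vector (100, 10)
        AffineTransform rotation = AffineTransform.getRotateInstance(100, 10);
        System.out.println(Arrays.toString(toPdfMatrix(rotation)));
        System.out.println("angle in degrees: " + Math.toDegrees(Math.atan2(10, 100)));
    }
}
```

Note that the two arguments form a direction vector, not an angle: (100, 10) amounts to a rotation of roughly 5.7 degrees. If you prefer to think in angles, AffineTransform.getRotateInstance(Math.toRadians(angle)) should serve the same purpose.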

Your tasks

Concerning the attributes you stated you want to change, therefore:

  • Rotation: There is no rotation attribute in the Callout annotation per se (other than the flag whether or not to follow the page rotation). Thus, you cannot set a rotation as a simple annotation attribute. If the source annotation does have an appearance stream, though, you can manipulate its Matrix to rotate it inside the annotation rectangle, see above.

  • Border color and font: If your Callout has an appearance stream, you can try and parse its content using a ContentScanner and manipulate the color and font setting operations. Otherwise, if rich text information is set, you can try and parse the rich text using some XML parser and manipulate the font style attributes. Otherwise, you can parse the default appearance DA string and manipulate its font and color setting instructions.
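As a rough illustration of the last option, the following stand-alone sketch (plain JDK) tokenizes a DA string, swaps the font resource name in front of the Tf operator, and replaces whatever non-stroking color operator (g, rg, or k) it finds by a gray level. The class and method names are hypothetical, and the sketch assumes a simple DA string consisting only of names, numbers, and operators separated by white space.

```java
import java.util.ArrayList;
import java.util.List;

final class DefaultAppearanceSketch {
    /** Rewrites a DA string to use the given font resource name and non-stroking gray level. */
    static String restyle(String da, String fontName, double gray) {
        List<String> out = new ArrayList<>();
        List<String> operands = new ArrayList<>(); // operands collected for the next operator
        for (String token : da.trim().split("\\s+")) {
            switch (token) {
                case "Tf": // operands are: /FontName size
                    if (operands.size() >= 2) {
                        operands.set(operands.size() - 2, "/" + fontName);
                    }
                    out.addAll(operands);
                    out.add(token);
                    operands.clear();
                    break;
                case "g":
                case "rg":
                case "k": // non-stroking gray, RGB, or CMYK color: replace by a gray level
                    out.add(Double.toString(gray));
                    out.add("g");
                    operands.clear();
                    break;
                default:
                    if (isOperand(token)) {
                        operands.add(token);
                    } else { // some other operator: pass it through unchanged
                        out.addAll(operands);
                        out.add(token);
                        operands.clear();
                    }
            }
        }
        out.addAll(operands); // dangling operands, if any
        return String.join(" ", out);
    }

    private static boolean isOperand(String token) {
        return token.startsWith("/") || token.matches("[-+]?(\\d+\\.?\\d*|\\.\\d+)");
    }

    public static void main(String[] args) {
        System.out.println(restyle("0 0 1 rg /Helv 12 Tf", "HeBo", 0.5));
    }
}
```

The result could then be written back with getBaseDataObject().put(PdfName.DA, new PdfString(...)). The new font name still has to be one the viewer can resolve, so treat this as a starting point rather than a general solution.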

Some example code

I created a file with an example Callout annotation using Adobe Acrobat: Callout-Yellow.pdf. It contains an appearance stream, rich text, and simple attributes, so one can use this file for example manipulations at different levels.

Then I applied this code to it with different values for keepAppearanceStream and keepRichText (you didn't mention whether you used PDF Clown for Java or .Net, so I chose Java; a port to .Net should be trivial, though...):

boolean keepAppearanceStream = ...;
boolean keepRichText = ...;

try (   InputStream sourceResource = GET_STREAM_FOR("Callout-Yellow.pdf");
        InputStream targetResource = GET_STREAM_FOR("test123.pdf");
        org.pdfclown.files.File sourceFile = new org.pdfclown.files.File(sourceResource);
        org.pdfclown.files.File targetFile = new org.pdfclown.files.File(targetResource); ) {
    Document sourceDoc = sourceFile.getDocument();
    Page sourcePage = sourceDoc.getPages().get(0);
    Annotation<?> sourceAnnotation = sourcePage.getAnnotations().get(0);

    Document targetDoc = targetFile.getDocument();
    Page targetPage = targetDoc.getPages().get(0);

    StaticNote targetAnnotation = (StaticNote) sourceAnnotation.clone(targetDoc);

    if (keepAppearanceStream) {
        // changing properties of an appearance
        // rotating the appearance in the appearance rectangle
        targetAnnotation.getAppearance().getNormal().get(null).setMatrix(AffineTransform.getRotateInstance(100, 10));
    } else {
        // removing the appearance to allow lower level properties changes
        targetAnnotation.setAppearance(null);
    }

    // changing text background color
    targetAnnotation.setColor(new DeviceRGBColor(0, 0, 1));

    if (keepRichText) {
        // changing rich text properties
        PdfString richText = (PdfString) targetAnnotation.getBaseDataObject().get(PdfName.RC);
        String richTextString = richText.getStringValue();
        // replacing the font family
        richTextString = richTextString.replaceAll("font-family:Helvetica", "font-family:Courier");
        richText = new PdfString(richTextString);
        targetAnnotation.getBaseDataObject().put(PdfName.RC, richText);
    } else {
        targetAnnotation.getBaseDataObject().remove(PdfName.RC);
        targetAnnotation.getBaseDataObject().remove(PdfName.DS);
    }

    // changing default appearance properties
    PdfString defaultAppearance = (PdfString) targetAnnotation.getBaseDataObject().get(PdfName.DA);
    String defaultAppearanceString = defaultAppearance.getStringValue();
    // replacing the font
    defaultAppearanceString = defaultAppearanceString.replaceFirst("Helv", "HeBo");
    // replacing the text and line color
    defaultAppearanceString = defaultAppearanceString.replaceFirst(". . . rg", ".5 g");
    defaultAppearance = new PdfString(defaultAppearanceString);
    targetAnnotation.getBaseDataObject().put(PdfName.DA, defaultAppearance);

    // changing the text value
    PdfString contents = (PdfString) targetAnnotation.getBaseDataObject().get(PdfName.Contents);
    String contentsString = contents.getStringValue();
    contentsString = contentsString.replaceFirst("text", "text line");
    contents = new PdfString(contentsString);
    targetAnnotation.getBaseDataObject().put(PdfName.Contents, contents);

    // change the line width and style
    targetAnnotation.setBorder(new Border(0, new LineDash(new double[] {3, 2})));

    targetPage.getAnnotations().add(targetAnnotation);

    targetFile.save(new File(RESULT_FOLDER, "test123-withCalloutCopy.pdf"),  SerializationModeEnum.Standard);
}

(CopyCallOut test testCopyCallout)

Beware, the code only has proof-of-concept quality: for arbitrary PDFs you cannot simply expect a string replace of "font-family:Helvetica" by "font-family:Courier", of "Helv" by "HeBo", or of ". . . rg" by ".5 g" to do the job. Fonts can be given using different style attributes or names, and different coloring instructions may be used.
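For the rich text part, a somewhat less brittle approach than a plain string replace is to parse the XHTML and edit the style attributes property by property. The following is only a sketch under assumptions: it uses the JDK's built-in XML parser, the class and method names are invented for illustration, and it expects the RC value to be well-formed XML that carries its formatting in style attributes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RichTextStyleSketch {
    /** Adds or overwrites one CSS property in a style value like "font-family:Helvetica;font-size:12pt". */
    static String setStyleProperty(String style, String property, String value) {
        Map<String, String> declarations = new LinkedHashMap<>();
        for (String declaration : style.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon > 0) {
                declarations.put(declaration.substring(0, colon).trim(),
                        declaration.substring(colon + 1).trim());
            }
        }
        declarations.put(property, value); // keeps the original position if the property existed
        StringJoiner joined = new StringJoiner(";");
        declarations.forEach((name, val) -> joined.add(name + ":" + val));
        return joined.toString();
    }

    /** Applies setStyleProperty to every element of the rich text that has a style attribute. */
    static String restyle(String richText, String property, String value) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(richText)));
            NodeList elements = document.getElementsByTagName("*"); // all elements
            for (int i = 0; i < elements.getLength(); i++) {
                Element element = (Element) elements.item(i);
                if (element.hasAttribute("style")) {
                    element.setAttribute("style",
                            setStyleProperty(element.getAttribute("style"), property, value));
                }
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Rich text is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String rc = "<body xmlns=\"http://www.w3.org/1999/xhtml\"><p>"
                + "<span style=\"font-family:Helvetica;font-size:12pt\">text</span></p></body>";
        System.out.println(restyle(rc, "font-family", "Courier"));
    }
}
```

The string passed in would be the value retrieved via getBaseDataObject().get(PdfName.RC), and the result would be put back as a new PdfString, as in the code above. Fonts set by other means than a font-family property in a style attribute are not covered by this sketch.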

Screenshots in Adobe

  • The original file:

    [Screenshot: original annotation]

  • keepAppearanceStream = true:

    [Screenshot: appearance stream kept but rotated]

  • keepAppearanceStream = false and keepRichText = true:

    [Screenshot: appearance stream dropped, rich text kept but manipulated]

  • keepAppearanceStream = false and keepRichText = false:

    [Screenshot: appearance stream and rich text dropped, simple attributes manipulated]

As a post comment: mkl, your great advice is really helpful when creating new annotations. I applied the following as a method of "copying" an existing annotation, where note is the "cloned" annotation and baseAnnotation is the source:

    foreach (PdfName t in baseAnnotation.BaseDataObject.Keys)
    {
        if (t.Equals(PdfName.DA) || t.Equals(PdfName.DS) || t.Equals(PdfName.RC) || t.Equals(PdfName.Rotate))
        {
            note.BaseDataObject[t] = baseAnnotation.BaseDataObject[t];
        }
    }

Thanks again
