Merge word(docx) documents with DOCX4J: how to copy images?

I need to merge two (or more, but let's stick to two) word documents (docx) with docx4j. My approach is to copy all body children from one document and append them to the other. Then I just rearrange some things. I have been using it for two years, and it works fine for my purposes.

Here is a simple example:

first.docx = simple text
second.docx = simple text + image

    File first = new File("first.docx");
    File second = new File("second.docx");

    WordprocessingMLPackage f = WordprocessingMLPackage.load(first);
    WordprocessingMLPackage s = WordprocessingMLPackage.load(second);

    List body = s.getMainDocumentPart().getJAXBNodesViaXPath("//w:body", false);
    for(Object b : body){
        List filhos = ((org.docx4j.wml.Body)b).getContent();
        for(Object k : filhos)
            f.getMainDocumentPart().addObject(k);
    }

    List blips = s.getMainDocumentPart().getJAXBNodesViaXPath("//a:blip", false);
    for(Object el : blips){
        try {
            CTBlip blip = (CTBlip) el;

            RelationshipsPart parts = s.getMainDocumentPart().getRelationshipsPart();
            Relationship rel = parts.getRelationshipByID(blip.getEmbed());

            RelationshipsPart docRels = f.getMainDocumentPart().getRelationshipsPart();

            rel.setId(null);
            docRels.addRelationship(rel);
            blip.setEmbed(rel.getId());

            f.getMainDocumentPart().addTargetPart(s.getParts().getParts().get(new PartName("/word/"+rel.getTarget())));

        } catch (Exception ex){}
    }

    File saved = new File("saved.docx");
    f.save(saved);

    Desktop.getDesktop().open(saved);

The problem comes when I save. These errors come out:

    org.docx4j.openpackaging.exceptions.Docx4JException: Failed to add parts from relationships of /
        at org.docx4j.openpackaging.io3.Save.addPartsFromRelationships(Save.java:390)
        at org.docx4j.openpackaging.io3.Save.save(Save.java:192)
        at org.docx4j.openpackaging.packages.OpcPackage.save(OpcPackage.java:441)
        at org.docx4j.openpackaging.packages.OpcPackage.save(OpcPackage.java:406)
    Caused by: org.docx4j.openpackaging.exceptions.Docx4JException: Failed to add parts from relationships of /word/document.xml
        at org.docx4j.openpackaging.io3.Save.addPartsFromRelationships(Save.java:390)
        at org.docx4j.openpackaging.io3.Save.savePart(Save.java:442)
        at org.docx4j.openpackaging.io3.Save.addPartsFromRelationships(Save.java:385)
        ... 4 more
    Caused by: org.docx4j.openpackaging.exceptions.Docx4JException: Failed to put binary part
        at org.docx4j.openpackaging.io3.stores.ZipPartStore.saveBinaryPart(ZipPartStore.java:398)
        at org.docx4j.openpackaging.io3.Save.savePart(Save.java:418)
        at org.docx4j.openpackaging.io3.Save.addPartsFromRelationships(Save.java:385)
        ... 6 more
    Caused by: java.io.IOException: part '/word/media/image1.jpg' not found
        at org.docx4j.openpackaging.io3.stores.ZipPartStore.saveBinaryPart(ZipPartStore.java:361)
        ... 8 more

Can anyone shed some light on how to solve this?

1) I don't want altChunk, it sucks.
2) The commercial (Enterprise) version of docx4j can do it, but I'm looking for FOSS.

Thanks

Manipulate the blips in s before adding them to f. In other words, swap the order of your for loops.

Then in your blip manipulation, what you need to do is:

  • get the part of interest
  • Relationship rel = f.getMainDocumentPart().addTargetPart(part)
  • update the relId in the blip from rel.getId()

Now add the content of s to f. You could just use addAll to do the job without the nested loop. Also, there's only one body object, so you don't need the outer loop.

Obviously this answer is limited to handling only CTBlip, and then only embedded ones. There's a lot more to a complete solution to merging docx files...

Note: I wrote the code for merging documents in docx4j Enterprise.
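To make the ordering concrete, here is a minimal sketch of the same steps using plain Java collections. It is a toy model, not docx4j: `Pkg`, `Blip`, `addTargetPart` and `merge` are hypothetical stand-ins for a package, an a:blip element, and the calls described above, and it assumes relationship ids are simply allocated in sequence.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of merging two packages; hypothetical types, NOT the docx4j API. */
final class RelIdRemapDemo {

    /** Stand-in for an a:blip element: it only holds a relationship id. */
    static final class Blip {
        String embed; // e.g. "rId1"
        Blip(String embed) { this.embed = embed; }
    }

    /** Stand-in for a package: relationship id -> image bytes, plus body content. */
    static final class Pkg {
        final Map<String, byte[]> rels = new LinkedHashMap<>();
        final List<Object> body = new ArrayList<>();
        private int nextId = 1;

        /** Registers a part in this package and returns its new relationship id. */
        String addTargetPart(byte[] part) {
            String id = "rId" + nextId++;
            rels.put(id, part);
            return id;
        }
    }

    /** Appends the body of s to f, re-registering every image in f first. */
    static void merge(Pkg f, Pkg s) {
        // Step 1: fix up the blips while they still belong to s.
        for (Object o : s.body) {
            if (o instanceof Blip) {
                Blip blip = (Blip) o;
                byte[] part = s.rels.get(blip.embed); // get the part of interest
                String newId = f.addTargetPart(part); // add it to the target package
                blip.embed = newId;                   // update the id in the blip
            }
        }
        // Step 2: only now copy the content across, in a single call.
        f.body.addAll(s.body);
    }

    public static void main(String[] args) {
        Pkg f = new Pkg();
        f.body.add("text from first.docx");

        Pkg s = new Pkg();
        s.body.add("text from second.docx");
        Blip image = new Blip(s.addTargetPart(new byte[] {42}));
        s.body.add(image);

        merge(f, s);
        System.out.println("blip now refers to " + image.embed
                + ", known to f: " + f.rels.containsKey(image.embed));
    }
}
```

The point of the model is that every id stored in a copied blip must have been issued by the target package; copying the body first leaves the blips pointing at ids that only the source package knows about.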

Here is real working code, combined from the posts above:

    // Run this on s BEFORE its body content is added to f.
    List<Object> blips = s.getMainDocumentPart().getJAXBNodesViaXPath("//a:blip", false);
    for (Object el : blips) {
        try {
            // Resolve the image part that this blip points to in the source document.
            CTBlip blip = (CTBlip) el;
            RelationshipsPart parts = s.getMainDocumentPart().getRelationshipsPart();
            Relationship rel = parts.getRelationshipByID(blip.getEmbed());
            Part part = parts.getPart(rel);

            // Touch the image bytes before the part is moved (see the note below the code).
            if (part instanceof ImagePngPart)
                System.out.println(((ImagePngPart) part).getBytes());
            if (part instanceof ImageJpegPart)
                System.out.println(((ImageJpegPart) part).getBytes());
            if (part instanceof ImageBmpPart)
                System.out.println(((ImageBmpPart) part).getBytes());
            if (part instanceof ImageGifPart)
                System.out.println(((ImageGifPart) part).getBytes());
            if (part instanceof ImageEpsPart)
                System.out.println(((ImageEpsPart) part).getBytes());
            if (part instanceof ImageTiffPart)
                System.out.println(((ImageTiffPart) part).getBytes());

            // Add the part to the target document, renaming it if the name is already taken.
            Relationship newrel = f.getMainDocumentPart().addTargetPart(part, AddPartBehaviour.RENAME_IF_NAME_EXISTS);

            // Point the blip at the relationship id issued by the target document.
            blip.setEmbed(newrel.getId());
            f.getMainDocumentPart().addTargetPart(s.getParts().getParts().get(new PartName("/word/" + rel.getTarget())));

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

This snippet copies images, along with their decoration in the docx, from document s to document f. The System.out.println calls are needed for a reason I have forgotten, but without them the library cannot determine the MIME type of the image.
