
Replace text in text box of docx by using Apache POI

I am using Apache POI to replace words in a docx file. For normal paragraphs, I have succeeded in using XWPFParagraph and XWPFRun to replace the words. Then I tried to replace words in a text box. I followed https://stackoverflow.com/a/25877256 to get the text in the text box, and I can print that text to the console. However, I failed to replace words in the text box. Here is some of my code:

    for (XWPFParagraph paragraph : doc.getParagraphs()) {
        XmlObject[] textBoxObjects = paragraph.getCTP().selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' declare namespace wps='http://schemas.microsoft.com/office/word/2010/wordprocessingShape' .//*/wps:txbx/w:txbxContent");
        for (int i = 0; i < textBoxObjects.length; i++) {
            XWPFParagraph embeddedPara = null;
            try {
                XmlObject[] paraObjects = textBoxObjects[i].selectChildren(
                    new QName("http://schemas.openxmlformats.org/wordprocessingml/2006/main", "p"));

                for (int j = 0; j < paraObjects.length; j++) {
                    embeddedPara = new XWPFParagraph(CTP.Factory.parse(paraObjects[j].xmlText()), paragraph.getBody());
                    List<XWPFRun> runs = embeddedPara.getRuns();
                    for (XWPFRun r : runs) {
                        String text = r.getText(0);
                        if (text != null && text.contains(someWords)) {
                            text = text.replace(someWords, "replaced");
                            r.setText(text, 0);
                        }
                    }
                }
            } catch (XmlException e) {
                // handle
            }
        }
    }

I think the problem is that I create a new XWPFParagraph, embeddedPara, so the replacement happens in embeddedPara and not in the original paragraph. As a result, the words are still unchanged after I write the document to a file.

How can I read and replace the words in the text box without creating a new XWPFParagraph?

The problem occurs because Word text boxes may be contained in several different XmlObjects, depending on the Word version. Those XmlObjects may also be in very different namespaces. So selectChildren cannot follow the namespace route, and it returns an XmlAnyTypeImpl.

What all text box implementations have in common is that their runs are in the path .//*/w:txbxContent/w:p/w:r. So we can use an XmlCursor which selects that path. Then we collect all selected XmlObjects in a List<XmlObject>. Then we parse CTRs from those objects; these are of course only CTRs outside the document context. But we can create XWPFRuns from them, do the replacing there, and then set the XML content of those XWPFRuns back into the objects. After this, the objects contain the replaced content.
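To see why a path that only names w:txbxContent matches text boxes regardless of the wrapping element, the selection can be tried on a small fragment with the JDK's own XPath support instead of XmlBeans. This is only a sketch under assumptions: the SAMPLE fragment is hand-written and heavily simplified (real documents nest these elements much deeper), and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses JDK XPath on plain XML, not Apache POI.
class TextBoxPathSketch {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Simplified, hand-written fragment (an assumption, not taken from a real file):
    // one text box wrapped in wps:txbx and one wrapped in the VML element v:textbox.
    static final String SAMPLE =
        "<w:p xmlns:w='" + W_NS + "'"
      + " xmlns:wps='http://schemas.microsoft.com/office/word/2010/wordprocessingShape'"
      + " xmlns:v='urn:schemas-microsoft-com:vml'>"
      + "<w:r><wps:txbx><w:txbxContent><w:p><w:r><w:t>TextBox A</w:t></w:r></w:p></w:txbxContent></wps:txbx></w:r>"
      + "<w:r><v:textbox><w:txbxContent><w:p><w:r><w:t>TextBox B</w:t></w:r></w:p></w:txbxContent></v:textbox></w:r>"
      + "</w:p>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed so the w: prefix resolves to W_NS
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Selects every run below a w:txbxContent, whatever element wraps the text box.
    static NodeList selectTextBoxRuns(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "w".equals(prefix) ? W_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) { return null; }
        });
        return (NodeList) xpath.evaluate(".//*/w:txbxContent/w:p/w:r",
                doc.getDocumentElement(), XPathConstants.NODESET);
    }

    // Replaces 'search' in the w:t children of the selected runs; returns how many w:t changed.
    static int replaceInTextBoxes(Document doc, String search, String replacement) throws Exception {
        NodeList runs = selectTextBoxRuns(doc);
        int changed = 0;
        for (int i = 0; i < runs.getLength(); i++) {
            NodeList children = runs.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                boolean isText = child.getNodeType() == Node.ELEMENT_NODE
                        && W_NS.equals(child.getNamespaceURI())
                        && "t".equals(child.getLocalName());
                if (isText && child.getTextContent().contains(search)) {
                    child.setTextContent(child.getTextContent().replace(search, replacement));
                    changed++;
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println("w:t elements changed: " + replaceInTextBoxes(doc, "TextBox", "replaced"));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Because the path never mentions wps:txbx or v:textbox, both wrappers in the fragment are matched by the same expression, which is the property the XmlCursor selection relies on.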

Example:


import java.io.FileOutputStream;
import java.io.FileInputStream;

import org.apache.poi.xwpf.usermodel.*;

import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlCursor;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTR;

import java.util.List;
import java.util.ArrayList;

public class WordReplaceTextInTextBox {

 public static void main(String[] args) throws Exception {

  XWPFDocument document = new XWPFDocument(new FileInputStream("WordReplaceTextInTextBox.docx"));

  String someWords = "TextBox";

  for (XWPFParagraph paragraph : document.getParagraphs()) {
   // select all runs inside text box content, whatever element wraps the text box
   XmlCursor cursor = paragraph.getCTP().newCursor();
   cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:txbxContent/w:p/w:r");

   // collect the selected objects first, then modify them
   List<XmlObject> ctrsintxtbx = new ArrayList<XmlObject>();

   while(cursor.hasNextSelection()) {
    cursor.toNextSelection();
    XmlObject obj = cursor.getObject();
    ctrsintxtbx.add(obj);
   }
   for (XmlObject obj : ctrsintxtbx) {
    // parse a CTR from the object; this is a copy outside the document context
    CTR ctr = CTR.Factory.parse(obj.xmlText());
    //CTR ctr = CTR.Factory.parse(obj.newInputStream());
    XWPFRun bufferrun = new XWPFRun(ctr, (IRunBody)paragraph);
    String text = bufferrun.getText(0);
    if (text != null && text.contains(someWords)) {
     text = text.replace(someWords, "replaced");
     bufferrun.setText(text, 0);
    }
    // set the XML of the buffer run back into the object that lives in the document
    obj.set(bufferrun.getCTR());
   }
  }

  FileOutputStream out = new FileOutputStream("WordReplaceTextInTextBoxNew.docx");
  document.write(out);
  out.close();
  document.close();
 }
}
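The step that makes the change stick is the final obj.set(bufferrun.getCTR()): a CTR parsed from obj.xmlText() is a detached copy, so editing it alone leaves the document untouched, which is also why the code in the question had no effect. The following sketch shows the same copy, edit, write-back pattern with the JDK's DOM API as an analogy only. It does not use POI or XmlBeans, the XML is a namespace-free toy fragment, and all names in it are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: a DOM analogy for "parse a copy, edit it, set it back".
class DetachedCopySketch {

    static DocumentBuilder newBuilder() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder();
    }

    static Document parse(String xml) throws Exception {
        return newBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Deep-copies the element into a fresh document; edits to the copy cannot reach the original.
    static Element detachedCopy(Element original) throws Exception {
        Document scratch = newBuilder().newDocument();
        Element copy = (Element) scratch.importNode(original, true);
        scratch.appendChild(copy);
        return copy;
    }

    // Puts the edited content back by replacing the original element in its own document.
    static void writeBack(Element original, Element edited) {
        Node imported = original.getOwnerDocument().importNode(edited, true);
        original.getParentNode().replaceChild(imported, original);
    }

    public static void main(String[] args) throws Exception {
        // Toy fragment without namespaces (an assumption made to keep the sketch short).
        Document doc = parse("<txbxContent><p><r><t>TextBox</t></r></p></txbxContent>");
        Element run = (Element) doc.getElementsByTagName("r").item(0);

        Element copy = detachedCopy(run);
        Node t = copy.getElementsByTagName("t").item(0);
        t.setTextContent(t.getTextContent().replace("TextBox", "replaced"));

        System.out.println("after editing the copy: " + doc.getDocumentElement().getTextContent());
        writeBack(run, copy);
        System.out.println("after writing back:     " + doc.getDocumentElement().getTextContent());
    }
}
```

The first line printed still shows the old text and only the second shows the replacement, mirroring the difference between the question's code, which never writes the copy back, and the answer's code, which does.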

