How to read and replace bookmark values with Apache POI

I'm a complete novice with Apache POI and I have already tried several things. My problem is that I have a few bookmarks in a docx file and I want to replace their values.

So far I have managed to add text at the bookmark, but the previous value is still there.

My code:

InputStream fis = new FileInputStream(fileName);
XWPFDocument document = new XWPFDocument(fis);
List<XWPFParagraph> paragraphs = document.getParagraphs();
for (XWPFParagraph paragraph : paragraphs)
{
    //Here you have your paragraph; 
    CTP ctp = paragraph.getCTP(); 
    // Get all bookmarks and loop through them 
    List<CTBookmark> bookmarks = ctp.getBookmarkStartList(); 
    for(CTBookmark bookmark : bookmarks) 
    { 
         if(bookmark.getName().equals("Firma1234"))
         {
             System.out.println(bookmark.getName());
             XWPFRun run = paragraph.createRun();
             run.setText(lcFirma);
             ctp.getDomNode().insertBefore(run.getCTR().getDomNode(), bookmark.getDomNode());
         }
    }   
}
OutputStream out = new FileOutputStream(output);
document.write(out);
document.close();
out.close();    

The value of "lcFirma" is "Firma" and the value of the bookmark is "Testmark". My docx file before:

Testmark -> name=Firma1234

My docx file after:

FirmaTestmark

As I said, the text is inserted before the bookmark's value instead of replacing it. How do I replace the text instead?

Greetings,

Kevin
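
For reference, the "before" file above suggests the paragraph looks roughly like bookmarkStart, a run holding "Testmark", then bookmarkEnd. Replacing rather than prepending then means removing whatever sits between the two bookmark elements before inserting the new run. The following is only a sketch of that idea on a plain org.w3c.dom tree, so it runs without POI. The class and method names (BookmarkDomSketch, replaceBookmarkText, parseParagraph) are made up for illustration, and it assumes the bookmark starts and ends inside the same paragraph. In POI the corresponding node would be the one returned by ctp.getDomNode() in the code above; whether this carries over unchanged to a given POI version is an assumption, not something tested here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class BookmarkDomSketch {
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses a WordprocessingML fragment; hypothetical helper for the demo only. */
    static Element parseParagraph(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse fragment", e);
        }
    }

    /** True if the node is the w:bookmarkEnd that carries the given id. */
    static boolean isMatchingEnd(Node node, String id) {
        return node instanceof Element
                && W.equals(node.getNamespaceURI())
                && "bookmarkEnd".equals(node.getLocalName())
                && id.equals(((Element) node).getAttributeNS(W, "id"));
    }

    /**
     * Replaces everything between the named bookmarkStart and its bookmarkEnd
     * with a single new run. Returns false (and changes nothing) if the
     * bookmark is missing or does not end among the start's following siblings.
     */
    static boolean replaceBookmarkText(Element paragraph, String name, String text) {
        NodeList starts = paragraph.getElementsByTagNameNS(W, "bookmarkStart");
        for (int i = 0; i < starts.getLength(); i++) {
            Element start = (Element) starts.item(i);
            if (!name.equals(start.getAttributeNS(W, "name"))) {
                continue;
            }
            String id = start.getAttributeNS(W, "id");
            Node end = start.getNextSibling();
            while (end != null && !isMatchingEnd(end, id)) {
                end = end.getNextSibling();
            }
            if (end == null) {
                return false; // bookmark spans paragraphs: not handled in this sketch
            }
            Node parent = start.getParentNode();
            while (start.getNextSibling() != end) {
                parent.removeChild(start.getNextSibling()); // drop the old content
            }
            Document doc = paragraph.getOwnerDocument();
            Element run = doc.createElementNS(W, "w:r");
            Element t = doc.createElementNS(W, "w:t");
            t.setTextContent(text);
            run.appendChild(t);
            parent.insertBefore(run, end); // new run sits inside the bookmark
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Element p = parseParagraph("<w:p xmlns:w=\"" + W + "\">"
                + "<w:bookmarkStart w:id=\"0\" w:name=\"Firma1234\"/>"
                + "<w:r><w:t>Testmark</w:t></w:r>"
                + "<w:bookmarkEnd w:id=\"0\"/></w:p>");
        replaceBookmarkText(p, "Firma1234", "Firma");
        System.out.println(p.getTextContent()); // prints "Firma"
    }
}
```

Note that the old run is removed wholesale, so any formatting it carried is lost; copying its run properties onto the new run would be a separate step.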

I had a similar requirement: setting the "Default text" field of a .docx bookmark. I could not find a way to do that, so as a workaround I replaced the entire paragraph containing the bookmark with text. Instead of the bookmark being populated with a default text, I ended up with a paragraph holding the text the bookmark should have shown. In my case the .docx was finally converted to a .pdf file, so the missing bookmark did not matter; having the correct text was what counted.

This is how I did it with Apache POI:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

import org.apache.commons.lang3.StringUtils;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTBookmark;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;

/**
 * 
 * @author binita.bharati@gmail.com
 * 
 * This code will replace bookmark with plain text. A bookmark is seen as "Text Form Field" in a .docx file.
 *
 */


public class BookmarkReplacer {

    public static void main(String[] args) throws Exception {
        replaceBookmark();

    }

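    // Rebuilds the paragraph text with the form-field placeholder replaced by bookmarkTxt.
    // 8194 is U+2002 (EN SPACE), not an ASCII code; this method assumes the empty
    // text form field appears in the paragraph text as five of these characters.
    private static String replaceBookmarkedPara(String input, String bookmarkTxt) {
        char[] tmp = input.toCharArray();
        StringBuilder sb = new StringBuilder();
        int bookmarkedCharCount = 0;
        for (int i = 0 ; i < tmp.length ; i++) {
            int charCode = tmp[i];
            if (charCode == 8194) {
                // Placeholder characters are dropped; the text goes in at the fifth one.
                bookmarkedCharCount ++;
                if (bookmarkedCharCount == 5) {
                    sb.append(bookmarkTxt);
                }
            }
            else {
                sb.append(tmp[i]);
            }
        }
        return sb.toString();
    }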

    private static void removeAllRuns(XWPFParagraph paragraph) {
        int size = paragraph.getRuns().size();
        for (int i = 0; i < size; i++) {
            paragraph.removeRun(0);
        }
    }

    private static void insertReplacementRuns(XWPFParagraph paragraph, String replacedText) {
        String[] replacementTextSplitOnCarriageReturn = StringUtils.split(replacedText, "\n");

        for (int j = 0; j < replacementTextSplitOnCarriageReturn.length; j++) {
            String part = replacementTextSplitOnCarriageReturn[j];

            XWPFRun newRun = paragraph.insertNewRun(j);
            newRun.setText(part);

            if (j+1 < replacementTextSplitOnCarriageReturn.length) {
                newRun.addCarriageReturn();
            }
        }       
    }

    public static void replaceBookmark() throws Exception {
        InputStream fis = new FileInputStream("C:\\input.docx");
        XWPFDocument document = new XWPFDocument(fis);
        List<XWPFParagraph> paragraphs = document.getParagraphs();
        for (XWPFParagraph paragraph : paragraphs) {
            // Here you have your paragraph
            CTP ctp = paragraph.getCTP();
            // Get all bookmarks and loop through them
            List<CTBookmark> bookmarks = ctp.getBookmarkStartList();
            for (CTBookmark bookmark : bookmarks) {
                if (bookmark.getName().equals("data_incipit") || bookmark.getName().equals("incipit_Codcli")
                        || bookmark.getName().equals("Incipit_titolo")) {
                    String paraText = paragraph.getText();
                    System.out.println("paraText = "+paraText +" for bookmark name "+bookmark.getName());
                    String replacementText = replaceBookmarkedPara(paraText, "haha");
                    removeAllRuns(paragraph);
                    insertReplacementRuns(paragraph, replacementText);
                }
            }
        }
        OutputStream out = new FileOutputStream("C:\\output.docx");
        document.write(out);
        document.close();
        out.close();
    }
}
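
A note on the magic number in replaceBookmarkedPara: 8194 is U+2002 (EN SPACE), and the method relies on the placeholder being exactly five of them. A slightly more tolerant variant, sketched below, replaces the first run of consecutive EN SPACE characters whatever its length. This is only an illustration in plain Java; the class and method names (PlaceholderSketch, fillFirstPlaceholder) are invented here, and the assumption that the empty field always shows up as a run of U+2002 comes from the answer above, not from any specification.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderSketch {
    // One or more consecutive U+2002 (EN SPACE, decimal 8194) characters.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\u2002+");

    /**
     * Replaces the first run of EN SPACE characters with the given value.
     * Text without such a run is returned unchanged.
     */
    static String fillFirstPlaceholder(String paragraphText, String value) {
        // quoteReplacement keeps '$' and '\' in the value from being read as group references.
        return PLACEHOLDER.matcher(paragraphText)
                .replaceFirst(Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String text = "Dear \u2002\u2002\u2002\u2002\u2002, welcome";
        System.out.println(fillFirstPlaceholder(text, "haha")); // prints "Dear haha, welcome"
    }
}
```

Unlike the original loop, this leaves any later runs of EN SPACE untouched instead of silently dropping them, which may or may not be what you want when a paragraph holds several fields.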

Try the code below. It also collects paragraphs inside table cells, which document.getParagraphs() alone does not return:

    private List<XWPFParagraph> collectParagraphs()
    {
        List<XWPFParagraph> paragraphs = new ArrayList<>();
        paragraphs.addAll(this.document.getParagraphs());

        for (XWPFTable table : this.document.getTables())
        {
            for (XWPFTableRow row : table.getRows())
            {
                for (XWPFTableCell cell : row.getTableCells())
                    paragraphs.addAll(cell.getParagraphs());
            }
        }
        return paragraphs;
    }

    public List<String> getBookmarkNames()
    {
        List<String> bookmarkNames = new ArrayList<>();
        Iterator<XWPFParagraph> paraIter = null;
        XWPFParagraph para = null;
        List<CTBookmark> bookmarkList = null;
        Iterator<CTBookmark> bookmarkIter = null;
        CTBookmark bookmark = null;

        // Get an Iterator for the XWPFParagraph object and step through them
        // one at a time.
        paraIter = collectParagraphs().iterator();
        while (paraIter.hasNext())
        {
            para = paraIter.next();

            // Get a List of the CTBookmark objects that the paragraph
            // 'contains' and step through these one at a time.
            bookmarkList = para.getCTP().getBookmarkStartList();
            bookmarkIter = bookmarkList.iterator();
            while (bookmarkIter.hasNext())
            {
                bookmark = bookmarkIter.next();
                bookmarkNames.add(bookmark.getName());
            }
        }
        return bookmarkNames;
    }
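
The reason for walking the tables explicitly is that bookmarks inside table cells would otherwise be missed. As a rough cross-check, the same list of names can be obtained by querying the underlying XML for every bookmarkStart element, wherever it is nested. The sketch below does this on a plain org.w3c.dom tree so it runs without POI; the names BookmarkNamesSketch, parse and bookmarkNames are invented for illustration, and feeding it the real document.xml from a .docx is left as an assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class BookmarkNamesSketch {
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses a WordprocessingML fragment; hypothetical helper for the demo only. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse fragment", e);
        }
    }

    /** Returns the w:name of every w:bookmarkStart below root, in document order. */
    static List<String> bookmarkNames(Element root) {
        List<String> names = new ArrayList<>();
        NodeList starts = root.getElementsByTagNameNS(W, "bookmarkStart");
        for (int i = 0; i < starts.getLength(); i++) {
            names.add(((Element) starts.item(i)).getAttributeNS(W, "name"));
        }
        return names;
    }

    public static void main(String[] args) {
        Element body = parse("<w:body xmlns:w=\"" + W + "\">"
                + "<w:p><w:bookmarkStart w:id=\"0\" w:name=\"top\"/><w:bookmarkEnd w:id=\"0\"/></w:p>"
                + "<w:tbl><w:tr><w:tc><w:p>"
                + "<w:bookmarkStart w:id=\"1\" w:name=\"inCell\"/><w:bookmarkEnd w:id=\"1\"/>"
                + "</w:p></w:tc></w:tr></w:tbl></w:body>");
        System.out.println(bookmarkNames(body)); // prints "[top, inCell]"
    }
}
```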
