How to edit MS Word documents using Java?
I have a few Word templates, and my requirement is to replace some of the words/placeholders in the document based on user input, using Java. I tried a lot of libraries, including 2-3 versions of docx4j, but nothing worked well; they all just didn't do anything!
I know this question has been asked before, but I tried all the options I know. So, which Java library can I use to "really" replace/edit these templates? My preference goes to "easy to use / few lines of code" type libraries.
I am using Java 8 and my MS Word templates are in MS Word 2007 format.
Update
This code is written using the code sample provided by SO member Joop Eggen:
public Main() throws URISyntaxException, IOException, ParserConfigurationException, SAXException
{
URI docxUri = new URI("C:/Users/Yohan/Desktop/yohan.docx");
Map<String, String> zipProperties = new HashMap<>();
zipProperties.put("encoding", "UTF-8");
FileSystem zipFS = FileSystems.newFileSystem(docxUri, zipProperties);
Path documentXmlPath = zipFS.getPath("/word/document.xml");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(Files.newInputStream(documentXmlPath));
byte[] content = Files.readAllBytes(documentXmlPath);
String xml = new String(content, StandardCharsets.UTF_8);
//xml = xml.replace("#DATE#", "2014-09-24");
xml = xml.replace("#NAME#", StringEscapeUtils.escapeXml("Sniper"));
content = xml.getBytes(StandardCharsets.UTF_8);
Files.write(documentXmlPath, content);
}
However, this returns the error below:
java.nio.file.ProviderNotFoundException: Provider "C" Not found
at java.nio.file.FileSystems.newFileSystem(FileSystems.java:341)
at java.nio.file.FileSystems.newFileSystem(FileSystems.java:276)
For docx (a zip with XML and other files), one may use a Java zip file system together with XML or text processing.
URI docxUri = ,,, // "jar:file:/C:/... .docx"
Map<String, String> zipProperties = new HashMap<>();
zipProperties.put("encoding", "UTF-8");
try (FileSystem zipFS = FileSystems.newFileSystem(docxUri, zipProperties)) {
Path documentXmlPath = zipFS.getPath("/word/document.xml");
When using XML:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(Files.newInputStream(documentXmlPath));
//Element root = doc.getDocumentElement();
You can then use XPath to find the places, and write the XML back again.
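The XPath route could be sketched roughly as follows, using only the JDK's XML APIs. The class and method names (`XPathReplaceSketch`, `replaceInTextNodes`) are hypothetical, and the sketch assumes the whole placeholder sits inside a single `w:t` element; Word sometimes splits text across several runs, in which case it would not be found.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class XPathReplaceSketch {

    // Namespace of the main WordprocessingML elements (w:p, w:r, w:t, ...).
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static String replaceInTextNodes(String documentXml, String placeholder, String value)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(documentXml)));

        // XPath needs to be told what the "w" prefix means.
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "w".equals(prefix) ? W_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return W_NS.equals(uri) ? "w" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return W_NS.equals(uri)
                        ? Collections.singletonList("w").iterator()
                        : Collections.<String>emptyIterator();
            }
        });

        // Select every text element and filter in Java, so the placeholder
        // never has to be quoted inside the XPath expression.
        NodeList texts = (NodeList) xpath.evaluate("//w:t", doc, XPathConstants.NODESET);
        for (int i = 0; i < texts.getLength(); i++) {
            Node t = texts.item(i);
            String content = t.getTextContent();
            if (content.contains(placeholder)) {
                // The serializer escapes &, < and > when writing the text back.
                t.setTextContent(content.replace(placeholder, value));
            }
        }

        // Write the DOM back to a string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

The returned string would then be written back to /word/document.xml inside the zip file system.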
It might even be that you do not need XML, but could simply replace placeholders:
byte[] content = Files.readAllBytes(documentXmlPath);
String xml = new String(content, StandardCharsets.UTF_8);
xml = xml.replace("#DATE#", "2014-09-24");
xml = xml.replace("#NAME#", StringEscapeUtils.escapeXml("Sniper"));
...
content = xml.getBytes(StandardCharsets.UTF_8);
Files.delete(documentXmlPath);
Files.write(documentXmlPath, content);
For fast development, rename a copy of the .docx to a name with the .zip file extension, and inspect the files.
Files.write should already apply StandardOpenOption.TRUNCATE_EXISTING, but I have added Files.delete because an error occurred. See comments.
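Putting the pieces together, a minimal end-to-end sketch might look like the one below. It assumes the URI is built from a Path with a "jar:" prefix; a bare "C:/..." string is parsed as a URI with scheme "C", which would explain the Provider "C" error in the question. The class name `DocxPlaceholderSketch`, the method `replacePlaceholder` and the small `escapeXml` helper (a stand-in for StringEscapeUtils.escapeXml) are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class DocxPlaceholderSketch {

    /** Minimal escaping for XML text content (stand-in for StringEscapeUtils.escapeXml). */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static void replacePlaceholder(Path docx, String placeholder, String value)
            throws IOException {
        // Path.toUri() gives a "file:" URI; the "jar:" prefix selects the zip provider.
        URI docxUri = URI.create("jar:" + docx.toAbsolutePath().toUri());
        Map<String, String> zipProperties = new HashMap<>();
        zipProperties.put("encoding", "UTF-8");

        // Closing the file system (end of the try block) writes the changes to the .docx.
        try (FileSystem zipFS = FileSystems.newFileSystem(docxUri, zipProperties)) {
            Path documentXmlPath = zipFS.getPath("/word/document.xml");
            String xml = new String(Files.readAllBytes(documentXmlPath), StandardCharsets.UTF_8);
            xml = xml.replace(placeholder, escapeXml(value));
            Files.write(documentXmlPath, xml.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DocxPlaceholderSketch path/to/template.docx
        replacePlaceholder(Paths.get(args[0]), "#NAME#", "Sniper");
    }
}
```

This edits the file in place, so it is probably safer to run it on a copy of the template. Plain string replacement only works when the placeholder is stored as one contiguous piece of text in document.xml.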
Try Apache POI. POI can work with doc and docx, but docx is better documented and therefore better supported.
UPD: You can use XDocReport, which uses POI. I also recommend using xlsx for templates because it is more suitable and better documented.
I spent a few days on this issue, until I found that what makes the difference is the try-with-resources on the FileSystem instance, which appears in Joop Eggen's snippet but not in the question's snippet:
try (FileSystem zipFS = FileSystems.newFileSystem(docxUri, zipProperties))
Without such a try-with-resources block, the FileSystem resource will not be closed (as explained in the Java tutorial), and the Word document will not be modified.
Stepping back a bit, there are about 4 different approaches for editing words/placeholders:
Before choosing one, you should decide whether you also need to be able to handle:
If you need these, then MERGEFIELD or DOCPROPERTY fields are probably out (though you can also use IF fields, if you can find a library which supports them). And adding images makes DOM/SAX manipulation, as advocated in one of the other answers, messier and more error-prone.
The other things to consider are:
Please try this to edit or replace a word in the document:
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class UpdateDocument {
    public static void main(String[] args) throws IOException {
        UpdateDocument obj = new UpdateDocument();
        obj.updateDocument(
                "c:\\test\\template.docx",
                "c:\\test\\output.docx",
                "Piyush");
    }

    private void updateDocument(String input, String output, String name)
            throws IOException {
        try (XWPFDocument doc = new XWPFDocument(
                Files.newInputStream(Paths.get(input)))) {
            List<XWPFParagraph> xwpfParagraphList = doc.getParagraphs();
            // Iterate over the paragraphs and check each run for the replaceable text
            for (XWPFParagraph xwpfParagraph : xwpfParagraphList) {
                for (XWPFRun xwpfRun : xwpfParagraph.getRuns()) {
                    String docText = xwpfRun.getText(0);
                    // getText(0) returns null for runs that contain no text; skip those
                    if (docText == null) {
                        continue;
                    }
                    // Replace the placeholder and write the text back at position 0
                    docText = docText.replace("${name}", name);
                    xwpfRun.setText(docText, 0);
                }
            }
            // Save the modified document to the output file
            try (FileOutputStream out = new FileOutputStream(output)) {
                doc.write(out);
            }
        }
    }
}