How to convert doc or docx into HTML in Java. Using Apache POI, I was able to convert doc to html but unable to convert docx into html? Please show me sample code? This code work with doc but not docx.
HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(stream);
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
wordToHtmlConverter.processDocument(wordDocument);
Document htmlDocument = wordToHtmlConverter.getDocument();
ByteArrayOutputStream out = new ByteArrayOutputStream();
DOMSource domSource = new DOMSource(htmlDocument);
StreamResult streamResult = new StreamResult(out);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer = tf.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
out.close();
String result = new String(out.toByteArray());
There is no reason why this shouldn't / can't work.
Please review the following:
In short, make sure you're using an up-to-date version of POI, and have all of the required libraries.
(If you need additional assistance, please explain what isn't working. Are you getting compile-time errors? Run-time errors? Unexpected output?)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.