
Making Word parse HTML formatting

We've got a JSP that uses the NicEdit online text editor to format text using JavaScript. The "submit" button runs a servlet that uploads the message string to our MySQL database, after which it prints the string on paper using the following code:

POIFSFileSystem fs = new POIFSFileSystem();
DirectoryEntry directory = fs.getRoot();
directory.createDocument("WordDocument", new ByteArrayInputStream(content.getBytes()));
FileOutputStream out = new FileOutputStream(filename);
fs.writeFilesystem(out);
out.close();

Desktop.getDesktop().print(destinationFile);

My question is, how do we keep the formatting on the printed page (bold, italic etc.) instead of it printing the literal <b>, <i>, <u> tags?

I must admit I haven't done much research beforehand, because I don't really know what to look for.

Thanks a lot,

JAMM

First, there are a bunch of formats you could submit to Word: doc, docx, rtf, html, Word 2003 XML, Flat OPC XML ...

This answer is specific to docx (though it looks like you might be sending .doc - not sure whether you are committed to that), where there are two ways you can handle HTML.

The first is to create an altChunk/alternative format input part containing the HTML, which Word can process when the docx is first opened.
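To make the altChunk idea concrete, here is a minimal sketch that builds such a docx by hand with only the JDK's zip classes. The class name AltChunkDocx, the part name word/afchunk.html and the relationship id htmlChunk are illustrative choices, not anything from the question; docx4j or another library would normally generate these parts for you. It assumes the editor output is wrapped in a complete HTML document.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: packages an HTML string as a docx whose body is a single altChunk.
final class AltChunkDocx {

    // Declares the content type of every part in the package, including the HTML chunk.
    private static final String CONTENT_TYPES =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">"
        + "<Default Extension=\"rels\" ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>"
        + "<Default Extension=\"xml\" ContentType=\"application/xml\"/>"
        + "<Default Extension=\"html\" ContentType=\"text/html\"/>"
        + "<Override PartName=\"/word/document.xml\" ContentType=\"application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml\"/>"
        + "</Types>";

    // Points the package at the main document part.
    private static final String PACKAGE_RELS =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
        + "<Relationship Id=\"rId1\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument\" Target=\"word/document.xml\"/>"
        + "</Relationships>";

    // Links the main document to the HTML part via an aFChunk relationship.
    private static final String DOCUMENT_RELS =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
        + "<Relationship Id=\"htmlChunk\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk\" Target=\"afchunk.html\"/>"
        + "</Relationships>";

    // The body contains only the altChunk reference; Word imports the HTML when it opens the file.
    private static final String DOCUMENT =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\""
        + " xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<w:body><w:altChunk r:id=\"htmlChunk\"/></w:body>"
        + "</w:document>";

    // Returns the bytes of a docx package that embeds the given HTML as an altChunk.
    static byte[] build(String html) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            put(zip, "[Content_Types].xml", CONTENT_TYPES);
            put(zip, "_rels/.rels", PACKAGE_RELS);
            put(zip, "word/document.xml", DOCUMENT);
            put(zip, "word/_rels/document.xml.rels", DOCUMENT_RELS);
            put(zip, "word/afchunk.html", html);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Writes one UTF-8 encoded part into the zip package.
    private static void put(ZipOutputStream zip, String name, String body) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(body.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }
}
```

The returned bytes could be written to a .docx file in place of the POIFSFileSystem output and then handed to Desktop.getDesktop().print(...) as before. Note that the HTML is only converted when Word itself opens the file; other consumers of the docx will generally not see the formatted content.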

The second is to convert the HTML yourself. As of 2.8.0, docx4j (to which I'm a committer) can convert XHTML to docx content.
