
Apache-POI formatted text issue in .doc file

I am using this example in my project. Replacing the text works, but text that should be centered in the output file comes out left-aligned. The input file is a .doc. I have a feeling that the replacement is breaking the document's formatting, but I'm not sure what the problem is. How can I resolve this?

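import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Section;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class HWPFTest {
    public static void main(String[] args) {
        String filePath = "F:\\Sample.doc";
        // try-with-resources closes the input stream even if parsing fails
        try (FileInputStream in = new FileInputStream(filePath)) {
            POIFSFileSystem fs = new POIFSFileSystem(in);
            HWPFDocument doc = new HWPFDocument(fs);
            doc = replaceText(doc, "$VAR", "MyValue1");
            saveWord(filePath, doc);
        } catch (IOException e) {
            // also covers FileNotFoundException, which is a subclass of IOException
            e.printStackTrace();
        }
    }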

    private static HWPFDocument replaceText(HWPFDocument doc, String findText, String replaceText){
        Range r1 = doc.getRange(); 

        for (int i = 0; i < r1.numSections(); ++i ) { 
            Section s = r1.getSection(i); 
            for (int x = 0; x < s.numParagraphs(); x++) { 
                Paragraph p = s.getParagraph(x); 
                for (int z = 0; z < p.numCharacterRuns(); z++) { 
                    CharacterRun run = p.getCharacterRun(z); 
                    String text = run.text();
                    if(text.contains(findText)) {
                        run.replaceText(findText, replaceText);
                    } 
                }
            }
        } 
        return doc;
    }

    private static void saveWord(String filePath, HWPFDocument doc) throws IOException {
        // try-with-resources closes the stream even if write() fails,
        // and does not dereference a null stream if the constructor throws
        try (FileOutputStream out = new FileOutputStream(filePath)) {
            doc.write(out);
        }
    }
}

HWPF is not usable for writing .doc files. It may work for very simple file content, but small extras break it. I fear you are out of luck here. If it is an option for you, you might want to use RTF files and work on those instead. Word should still open an RTF file correctly if you rename its extension from .rtf to .doc (in case you need the .doc extension).

(I developed a custom, working variant of HWPF for a client and know how difficult things can get there. The standard HWPF library runs into trouble when the document contains characters outside the 8-bit encoding, tables, text boxes, embedded graphics, and so on. Some things in .doc files also differ from what Microsoft's official specification describes. Making a working HWPF library is "non-trivial" and requires a lot of spec-reading and investigation. If you want to go down the road of fixing those bugs, expect at least half a year of development effort.)
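The RTF route suggested above can be sketched roughly as follows. This is a minimal sketch, not a tested replacement for HWPF: it assumes the template has been saved as RTF, that the placeholder `$VAR` appears in the file as one contiguous piece of text (Word sometimes splits text across formatting runs), and that the value contains no line breaks. The class and method names (`RtfPlaceholderReplacer`, `escapeRtf`, `replacePlaceholder`) are made up for illustration. Because only the placeholder text is touched, paragraph control words such as `\qc` (centered) are left as they are.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: replaces a placeholder in an RTF file by plain text substitution.
public class RtfPlaceholderReplacer {

    // Escape a plain string so it can be embedded in RTF text.
    static String escapeRtf(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' || c == '{' || c == '}') {
                // These three characters have structural meaning in RTF
                sb.append('\\').append(c);
            } else if (c < 0x80) {
                sb.append(c);
            } else {
                // \uN takes a signed 16-bit decimal; '?' is the fallback
                // character for readers that do not understand \u
                sb.append("\\u").append((int) (short) c).append('?');
            }
        }
        return sb.toString();
    }

    // String.replace matches literally, so "$VAR" is not treated as a regex.
    static String replacePlaceholder(String rtf, String placeholder, String value) {
        return rtf.replace(placeholder, escapeRtf(value));
    }

    public static void main(String[] args) throws IOException {
        Path in = Paths.get(args[0]);   // e.g. the RTF template
        Path out = Paths.get(args[1]);  // e.g. the output file
        // ISO-8859-1 maps every byte to one char and back, so bytes
        // outside the placeholder survive the round trip unchanged
        String rtf = new String(Files.readAllBytes(in), StandardCharsets.ISO_8859_1);
        String result = replacePlaceholder(rtf, "$VAR", "MyValue1");
        Files.write(out, result.getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

The program takes the input and output paths as its two command-line arguments. If the placeholder is not found, it is worth opening the RTF in a text editor to check whether Word has split it across several groups.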
