How to get the whole content of a file using Apache POI?
I am trying to read a .docx file with the Apache POI Java API. I use:
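import java.io.FileInputStream;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public static String view(String nameDoc) {
    String text = null;
    try {
        // Open the .docx file and wrap it in a text extractor
        XWPFDocument docx = new XWPFDocument(
                new FileInputStream(nameDoc));
        XWPFWordExtractor we = new XWPFWordExtractor(docx);
        // getText() returns the document's text as a plain string
        text = we.getText();
        we.close();
        docx.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
    return text;
}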
In this case I only get the text of the file, but my file contains text, tables, pictures... How can I get the complete content of the file?
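String contents = "";
try {
    System.out.println("Starting the test");
    // HWPF reads the legacy .doc format; a .docx file needs XWPFDocument instead
    POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("D:/Resume.doc"));
    HWPFDocument doc = new HWPFDocument(fs);
    WordExtractor we = new WordExtractor(doc);
    // Write the extracted paragraphs to a PDF with iText.
    // PdfWriter.getInstance takes an iText Document, not an HWPFDocument.
    OutputStream file = new FileOutputStream(new File("D:/test.pdf"));
    Document pdf = new Document();
    PdfWriter.getInstance(pdf, file);
    pdf.open();
    for (String paragraph : we.getParagraphText()) {
        pdf.add(new Paragraph(paragraph));
    }
    pdf.close();
    file.close();
    // Read the text back from the PDF with PDFBox (PDDocument.load is the PDFBox 2.x call)
    PDDocument pdfDocument = PDDocument.load(new File("D:/test.pdf"));
    PDFTextStripper stripper = new PDFTextStripper();
    contents = stripper.getText(pdfDocument);
    pdfDocument.close();
} catch (Exception e) {
    logger.error(e.getMessage());
}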
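After this runs, contents holds the content of the file as text. Note that PDFTextStripper extracts text only, so pictures are not included.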
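
For the pictures the question asks about, here is a minimal sketch that does not use Apache POI at all. It relies on the fact that a .docx file is a ZIP package: the main text lives in word/document.xml, and embedded pictures are normally stored under word/media/. The class name DocxParts and its two methods are hypothetical names chosen for this illustration. The sketch only lists the package parts and returns the raw media bytes; it does not interpret tables or layout, and it does not apply to the older binary .doc format.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of Apache POI.
class DocxParts {

    // Returns the names of all parts (ZIP entries) in the .docx package.
    static List<String> listParts(InputStream docx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Returns the raw bytes of every entry under word/media/, keyed by entry name.
    static Map<String, byte[]> readMedia(InputStream docx) throws IOException {
        Map<String, byte[]> media = new LinkedHashMap<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().startsWith("word/media/")) {
                    continue;
                }
                // Copy the current entry's bytes into memory
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                media.put(entry.getName(), out.toByteArray());
            }
        }
        return media;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DocxParts <file.docx>");
            return;
        }
        // The stream is consumed by each pass, so the file is opened twice.
        try (InputStream in = new FileInputStream(args[0])) {
            listParts(in).forEach(System.out::println);
        }
        try (InputStream in = new FileInputStream(args[0])) {
            readMedia(in).forEach((name, bytes) ->
                    System.out.println(name + ": " + bytes.length + " bytes"));
        }
    }
}
```

Running it against a .docx path prints every part name, followed by the size of each media entry. The byte arrays returned by readMedia can be written to disk as image files, while the text itself can still come from the extractor shown in the question.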