
Get images from a document using Apache POI

I am using Apache POI to read images from a .docx file.

Here is my code:


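import java.awt.Image;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public Image ReadImg(int imageid) throws IOException {
    // Close the stream and the document once the image has been decoded
    try (FileInputStream in = new FileInputStream("import.docx");
         XWPFDocument doc = new XWPFDocument(in)) {
        List<XWPFPictureData> pic = doc.getAllPictures();
        byte[] data = pic.get(imageid).getData();
        // Decode the raw bytes with javax.imageio
        return ImageIO.read(new ByteArrayInputStream(data));
    }
}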

It reads the images correctly, but not in the order in which they appear in the document.

For example, if the document contains

image1.jpeg image2.jpeg image3.jpeg image4.jpeg image5.jpeg

it reads them as

image4 image3 image1 image5 image2

Could you please help me resolve this?

I want to read the images in document order.

Thanks, Sithik

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public static void extractImages(XWPFDocument docx) {
    try {
        List<XWPFPictureData> piclist = docx.getAllPictures();
        // Traverse the list and write each image to a file
        for (XWPFPictureData pic : piclist) {
            byte[] bytepic = pic.getData();
            // Write the original bytes so the content matches the extension in getFileName()
            Files.write(new File("D:/imagefromword/" + pic.getFileName()).toPath(), bytepic);
        }
    } catch (IOException e) {
        // Report the failure instead of exiting silently
        e.printStackTrace();
        System.exit(-1);
    }
}
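
The order returned by getAllPictures() does not appear to be tied to where the pictures sit in the text. The position of each picture is recorded in word/document.xml, where every embedded image is an a:blip element whose r:embed attribute points to an entry in word/_rels/document.xml.rels. As a rough sketch of that idea, the class below reads those two parts with only the JDK (Java 9+) and lists the image targets in document order. It is not POI API: the class name DocxImageOrder and the method imageTargetsInOrder are hypothetical, and it assumes the transitional OOXML namespaces and DrawingML pictures (older VML images and linked images are skipped).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Apache POI.
final class DocxImageOrder {
    // Namespaces used by transitional OOXML documents (assumption: not ISO strict).
    static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";
    static final String REL_NS =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    /** Returns relationship targets such as "media/image1.jpeg" in document order. */
    static List<String> imageTargetsInOrder(InputStream docx) throws Exception {
        byte[] documentXml = null;
        byte[] relsXml = null;
        // A .docx file is a zip archive; pick out the two parts we need.
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("word/document.xml")) {
                    documentXml = zip.readAllBytes(); // reads to the end of this entry
                } else if (entry.getName().equals("word/_rels/document.xml.rels")) {
                    relsXml = zip.readAllBytes();
                }
            }
        }
        if (documentXml == null || relsXml == null) {
            throw new IllegalArgumentException("not a .docx package with a main document part");
        }

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations to avoid XXE on untrusted files.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Map relationship id (e.g. "rId5") to its target part name.
        Map<String, String> targetById = new HashMap<>();
        Document rels = builder.parse(new ByteArrayInputStream(relsXml));
        NodeList relNodes = rels.getElementsByTagNameNS("*", "Relationship");
        for (int i = 0; i < relNodes.getLength(); i++) {
            Element rel = (Element) relNodes.item(i);
            targetById.put(rel.getAttribute("Id"), rel.getAttribute("Target"));
        }

        // getElementsByTagNameNS returns matches in document order.
        List<String> ordered = new ArrayList<>();
        Document body = builder.parse(new ByteArrayInputStream(documentXml));
        NodeList blips = body.getElementsByTagNameNS(DRAWINGML_NS, "blip");
        for (int i = 0; i < blips.getLength(); i++) {
            String id = ((Element) blips.item(i)).getAttributeNS(REL_NS, "embed");
            String target = targetById.get(id);
            if (target != null) {
                ordered.add(target); // blips without r:embed (linked images) are skipped
            }
        }
        return ordered;
    }
}
```

Calling DocxImageOrder.imageTargetsInOrder(new FileInputStream("import.docx")) would give the image part names in reading order, which can then be matched against XWPFPictureData.getFileName(). Staying within POI, a similar in-order walk should be possible by iterating doc.getParagraphs(), each paragraph's getRuns(), and each run's getEmbeddedPictures(), then calling getPictureData() on every XWPFPicture; pictures inside tables, headers, or footers would need their own traversal.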


 