简体   繁体   中英

How to get pictures with names from an xls file using Apache POI

Using workbook.getAllPictures() I can get an array of picture data but unfortunately it is only the data and those objects have no methods for accessing the name of the picture or any other related information.

There is a HSSFPicture class which would contain all the details of the picture but how to get for example an array of those objects from the xls?

Update:

Found SO question How can I find a cell, which contain a picture in apache poi which has a method for looping through all the pictures in the worksheet. That works.

Now that I was able to try the HSSFPicture class I found out that the getFileName() method is returning the file name without the extension. I can use the getPictureData().suggestFileExtension() to get a suggested file extension but I really would need to get the extension the picture had when it was added into the xls file. Would there be a way to get it?

Update 2:

The pictures are added into the xls with a macro. This is the part of macro that is adding the images into the sheet. fname is the full path and imageName is the file name, both are including the extension.

Set img = Sheets("Receipt images").Pictures.Insert(fname)
img.Left = 10
img.top = top + 10
img.Name = imageName
Set img = Nothing

The routine to check if the picture already exists in the Excel file.

For Each img In Sheets("Receipt images").Shapes
    If img.Name = imageName Then
        Set foundImage = img
        Exit For
    End If
Next

This recognizes that "image.jpg" is different from "image.gif", so the img.Name includes the extension.

The shape names are not in the default POI objects. So if we need them we have to deal with the underlying objects. That is for the shapes in HSSF mainly the EscherAggregate ( http://poi.apache.org/apidocs/org/apache/poi/hssf/record/EscherAggregate.html ) which we can get from the sheet. From its parent class AbstractEscherHolderRecord we can get all EscherOptRecords which contains the options of the shapes. In those options are also to find the groupshape.shapenames.

My example is not the complete solution. It is only provided to show which objects could be used to achieve this.

Example:

import org.apache.poi.hssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;

import java.io.FileOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

import org.apache.poi.hssf.record.*;
import org.apache.poi.ddf.*;

import java.util.List;
import java.util.ArrayList;

class ShapeNameTestHSSF {

 public static void main(String[] args) {
  try {

   InputStream inp = new FileInputStream("workbook1.xls");
   Workbook wb = WorkbookFactory.create(inp);

   Sheet sheet = wb.getSheetAt(0);

   EscherAggregate escherAggregate = ((HSSFSheet)sheet).getDrawingEscherAggregate();

   EscherContainerRecord escherContainer = escherAggregate.getEscherContainer().getChildContainers().get(0); 
   //throws java.lang.NullPointerException if no Container present

   List<EscherRecord> escherOptRecords = new ArrayList<EscherRecord>();

   escherContainer.getRecordsById(EscherOptRecord.RECORD_ID, escherOptRecords);

   for (EscherRecord escherOptRecord : escherOptRecords) {
    for (EscherProperty escherProperty : ((EscherOptRecord)escherOptRecord).getEscherProperties()) {
     System.out.println(escherProperty.getName());
     if (escherProperty.isComplex()) {
      System.out.println(new String(((EscherComplexProperty)escherProperty).getComplexData(), "UTF-16LE"));
     } else {
      if (escherProperty.isBlipId()) System.out.print("BlipId = ImageId = ");
      System.out.println(((EscherSimpleProperty)escherProperty).getPropertyValue());
     }
     System.out.println("=============================");
    }
    System.out.println(":::::::::::::::::::::::::::::");
   }


   FileOutputStream fileOut = new FileOutputStream("workbook1.xls");
   wb.write(fileOut);
   fileOut.flush();
   fileOut.close();

  } catch (InvalidFormatException ifex) {
  } catch (FileNotFoundException fnfex) {
  } catch (IOException ioex) {
  }
 }
}

Again: This is not a ready to use solution. A ready to use solution cannot be provided here, because of the complexity of the EscherRecords. Maybe to get the correct EscherRecords for the image shapes and their related EscherOptRecords, you have recursive to loop through all EscherRecords in the EscherAggregate checking whether they are ContainerRecords and if so loop through its children and so on.

Start here:

http://poi.apache.org/spreadsheet/quick-guide.html#Images

this tutorial can help you to extract an image's information from an xls spreadsheet using Apache POI

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM