
Apache POI HSSF XLS reading error

I am reading a .xls file with the following code, where s is the path to the file:

InputStream input = new FileInputStream(s);
Workbook wbs = new HSSFWorkbook(input);

I get the following error message:

Exception in thread "main" java.io.IOException: Invalid header signature; read 0x0010000000060809, expected 0xE11AB1A1E011CFD0

I need a program that can read both XLSX and XLS files. The exact same code, adjusted to use XSSF, reads the XLSX file without any problem.

You might get this error if the file is actually in xlsx format rather than xls. I would try using the generic Workbook object (also called the SS Usermodel).

Check out the Workbook interface and the WorkbookFactory class. The factory should be able to create a generic Workbook for you out of either xlsx or xls.
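A minimal sketch of that suggestion, assuming Apache POI is on the classpath (including the poi-ooxml module for .xlsx support). The class name `WorkbookOpener` and the method `open` are made up for illustration; `WorkbookFactory.create(InputStream)` is the POI call that inspects the data and returns an HSSF or XSSF workbook as appropriate. The method declares `throws Exception` because the checked exceptions declared by `create` have differed between POI releases.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

// Hypothetical helper, not part of POI.
class WorkbookOpener {

    // Opens either an .xls or an .xlsx file through the SS usermodel.
    static Workbook open(String path) throws Exception {
        try (InputStream input = new FileInputStream(path)) {
            // The factory decides based on the file contents, not the extension.
            return WorkbookFactory.create(input);
        }
    }
}
```

The calling code then works against the `Workbook`, `Sheet`, `Row` and `Cell` interfaces only, so it should not need to know which of the two formats was loaded.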

I thought I had a good tutorial on this, but I can't seem to find it. I'll keep looking though.

Edit

I found this little tiny snippet from Apache's site about reading and rewriting using the SS Usermodel.

I hope this helps!

Invalid header signature; read 0x342E312D46445025, expected 0xE11AB1A1E011CFD0

I got this error when I uploaded a corrupted xls/xlsx file (to simulate a corrupt file, I renamed sample.pdf to sample.xls). Add validation like:

Workbook wbs = null;
// try-with-resources closes the stream even if POI rejects the file
try (InputStream input = new FileInputStream(s)) {
    wbs = new HSSFWorkbook(input);
} catch (IOException e) {
    // log "file is corrupted", show error message to user
}
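The value after "read" in the message can also hint at what the file really is. The sketch below assumes that POI reports the first eight bytes of the file as a little-endian long, which is consistent with the expected value 0xE11AB1A1E011CFD0 matching the OLE2 header bytes D0 CF 11 E0 A1 B1 1A E1. The class `SignatureDecoder` and its methods are hypothetical helpers, not part of POI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for reading the values in the error message.
class SignatureDecoder {

    // Returns the 8 signature bytes in the order they appear in the file.
    static byte[] toFileBytes(long signature) {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(signature)
                .array();
    }

    // Renders the bytes as ASCII, replacing non-printable bytes with '.'.
    static String toAscii(long signature) {
        StringBuilder sb = new StringBuilder();
        for (byte b : toFileBytes(signature)) {
            sb.append(b >= 0x20 && b < 0x7F ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Value taken from the error message for the renamed PDF.
        System.out.println(toAscii(0x342E312D46445025L)); // prints "%PDF-1.4"
    }
}
```

For the renamed sample.pdf the decoded bytes are the PDF header, which confirms that the file was never an Excel file in the first place.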

The exception is telling you that the file you're supplying isn't a valid Excel binary file, at least not a valid Excel file produced since about 1990. It states what POI expected and shows that it found something else instead, which wasn't a valid .xls file and wasn't anything else POI can detect.

One thing to be aware of is that Excel opens a wide variety of different file formats, including .csv and .html. It's also not very picky about the file extension, so will happily open a CSV file that has been renamed to a .xls one. However, since renaming a .csv to a .xls doesn't magically change the format, POI still can't open it!
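One way to catch a renamed file before handing it to POI is to look at its first bytes rather than its extension. This is a rough sketch using only the JDK; the class `FormatSniffer`, its methods and the returned labels are invented for illustration. It only distinguishes the OLE2 container header (used by binary .xls, but also by other legacy Office files) from the ZIP header (used by .xlsx, but also by any other ZIP archive), so a match is a hint and not proof that the file is a spreadsheet.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper, not part of POI.
class FormatSniffer {

    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    private static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    // Classifies a file by its leading bytes.
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2)) return "OLE2 container (e.g. binary .xls)";
        if (startsWith(head, ZIP)) return "ZIP container (e.g. .xlsx)";
        return "unknown (CSV, HTML, PDF, ...)";
    }

    static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Reads up to 8 bytes from the file and classifies them.
    static String sniffFile(String path) throws IOException {
        byte[] head = new byte[8];
        int total = 0;
        try (FileInputStream in = new FileInputStream(path)) {
            int n;
            while (total < head.length
                    && (n = in.read(head, total, head.length - total)) > 0) {
                total += n;
            }
        }
        return sniff(Arrays.copyOf(head, total));
    }
}
```

A CSV or HTML file renamed to .xls falls into the "unknown" branch here, which is the case where POI reports an invalid header signature.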


From the exception, I can tell what's happening, and I can also tell you're using an ancient version of Apache POI! A header signature of 0x0010000000060809 corresponds to the Excel 4 file format, from about 25 years ago! If you use a more recent version of Apache POI, it'll give you a helpful error message telling you that the file supplied is an old and largely unsupported Excel file. New versions of POI do include the OldExcelExtractor tool which can pull out some information from those ancient formats.

Otherwise, as with all exceptions of this type, try opening the file in Excel and doing a save-as. That will give you an idea of what the file currently is (e.g. .html saved as .xls, .csv saved as .xls, etc.), and will also let you re-save it as a proper .xls file for POI to load and work with.

