
Reading Excel 2003 files with Apache POI - FileNotFound

I am writing some code to import Excel files into a database. The files could be big (thousands of rows), so I am using the Event API. The POI version is 3.9.

I am opening the file like this:

FileInputStream fin = new FileInputStream(file);
//create record listener
HSSFRecordListener mainListener =  new HSSFRecordListener("aaa.xls");
// create a new org.apache.poi.poifs.filesystem.Filesystem
POIFSFileSystem poifs = new POIFSFileSystem(fin);
// get the Workbook (excel part) stream in a InputStream
din = poifs.createDocumentInputStream("Workbook");

Some files cause the last line to throw a FileNotFoundException. Indeed, if I open those files with 7zip, there is no Workbook entry; there is a Book entry instead.

I have tried to work around this by opening the Book entry if Workbook is not found.

try {
    din = poifs.createDocumentInputStream("Workbook");
} catch (FileNotFoundException e) {
    try {
        din = poifs.createDocumentInputStream("Book");
    } catch (FileNotFoundException e1) {                    
        FileNotFoundException e2 = new FileNotFoundException("Neither Workbook nor Book found in file!");                    
        e2.initCause(e1);
        throw e2;
    }
}
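
The nested try/catch above can be generalized into a small helper that tries a list of entry names in order. This is only a sketch: EntryFallback and EntryOpener are hypothetical names, not part of POI, and the opener stands in for a call such as poifs.createDocumentInputStream(name). It only tidies the stream lookup; it does not make POI 3.9 able to parse the records found in a Book stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: returns the first entry that can be opened from a list of names. */
final class EntryFallback {

    /** Stand-in for a call such as poifs.createDocumentInputStream(name). */
    interface EntryOpener {
        InputStream open(String name) throws IOException;
    }

    static InputStream openFirst(List<String> names, EntryOpener opener) throws IOException {
        FileNotFoundException last = null;
        for (String name : names) {
            try {
                return opener.open(name);
            } catch (FileNotFoundException e) {
                last = e; // remember the most recent failure and try the next name
            }
        }
        FileNotFoundException failure =
                new FileNotFoundException("None of " + names + " found in file!");
        if (last != null) {
            failure.initCause(last); // keep the original failure for diagnostics
        }
        throw failure;
    }

    public static void main(String[] args) throws IOException {
        // Fake opener that only knows a "Book" entry, mimicking an Excel 5.0/95 file.
        EntryOpener fake = name -> {
            if ("Book".equals(name)) {
                return new ByteArrayInputStream(new byte[] {0x09, 0x08});
            }
            throw new FileNotFoundException("no such entry: " + name);
        };
        InputStream din = openFirst(Arrays.asList("Workbook", "Book"), fake);
        System.out.println("first byte: " + din.read());
    }
}
```

With POI on the classpath, the call would presumably look like EntryFallback.openFirst(Arrays.asList("Workbook", "Book"), poifs::createDocumentInputStream).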

This results in another exception:

org.springframework.web.util.NestedServletException: Request processing failed; nested exception is org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
    org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:894)
    org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:393)
root cause

org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
    org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:65)
    org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:301)
    org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.<init>(RecordFactoryInputStream.java:65)
    org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:182)
    org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:139)
    org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:106)
    pl.veracomp.service.SpreadsheetImportService.process(SpreadsheetImportService.java:369)
    pl.veracomp.controller.uploadController.onSubmit(uploadController.java:57)
    org.springframework.web.servlet.mvc.SimpleFormController.processFormSubmission(SimpleFormController.java:272)
    org.springframework.web.servlet.mvc.AbstractFormController.handleRequestInternal(AbstractFormController.java:268)
    org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:153)
    org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:48)
    org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
    org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
    org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
    org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:393)
root cause

org.apache.poi.hssf.record.RecordFormatException: Not enough data (0) to read requested (2) bytes
    org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:216)
    org.apache.poi.hssf.record.RecordInputStream.readShort(RecordInputStream.java:233)
    org.apache.poi.hssf.record.InterfaceHdrRecord.<init>(InterfaceHdrRecord.java:43)
    sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:57)
    org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:301)
    org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.<init>(RecordFactoryInputStream.java:65)
    org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:182)
    org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:139)
    org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:106)
    pl.veracomp.service.SpreadsheetImportService.process(SpreadsheetImportService.java:369)
    pl.veracomp.controller.uploadController.onSubmit(uploadController.java:57)
    org.springframework.web.servlet.mvc.SimpleFormController.processFormSubmission(SimpleFormController.java:272)
    org.springframework.web.servlet.mvc.AbstractFormController.handleRequestInternal(AbstractFormController.java:268)
    org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:153)
    org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:48)
    org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
    org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
    org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
    org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:393)

A Google search turns up some information about bugs fixed in POI 3.2 and 3.7 that are related to the exception "Not enough data (0) to read requested (2) bytes", but those seem to have been a different problem.

The same files open successfully in Excel 2007. When I save them manually with Save As => Excel 97/2003, 7zip shows that the Book entry has been replaced with Workbook, and I can import them successfully with Apache POI.

Has anyone run into this issue? How can I work around it?

EDIT

The problem occurs when I try to open files saved in the Microsoft Excel 5.0/95 file format.

To reproduce this issue, create a new spreadsheet, enter any data, and save it as Microsoft Excel 5.0/95 Workbook (*.xls).

Is there any way to read this format with Apache POI, or do I have to force my users to upgrade their workbooks before uploading?

It's a version problem: the file is in an old format. To confirm this, open your file with a newer version of Excel, modify it, save it, and retry.
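
The version can also be checked programmatically instead of by re-saving. The sketch below reads the first record header of the Book or Workbook stream and reports the BIFF version. It assumes the usual layout, where the stream starts with a BOF record (id 0x0809) whose version field is 0x0500 for Excel 5.0/95 and 0x0600 for Excel 97-2003; treat those constants as my reading of the format and check them against the file-format documentation. BiffSniffer is a hypothetical name, not a POI class.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: guesses the BIFF version from the start of a workbook stream. */
final class BiffSniffer {

    private static final int BOF_SID = 0x0809;

    static String describe(InputStream stream) throws IOException {
        byte[] header = new byte[6]; // record id (2), record length (2), version (2)
        try {
            new DataInputStream(stream).readFully(header);
        } catch (EOFException e) {
            return "too short to contain a BOF record";
        }
        if (le16(header, 0) != BOF_SID) {
            return "does not start with a BIFF5/BIFF8 BOF record";
        }
        int version = le16(header, 4);
        switch (version) {
            case 0x0500:
                return "BIFF5 (Excel 5.0/95)";
            case 0x0600:
                return "BIFF8 (Excel 97-2003)";
            default:
                return "unknown BIFF version 0x" + Integer.toHexString(version);
        }
    }

    /** Reads an unsigned little-endian 16-bit value. */
    private static int le16(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    public static void main(String[] args) throws IOException {
        // Hand-built header bytes resembling the start of an Excel 5.0/95 "Book" stream.
        byte[] biff5 = {0x09, 0x08, 0x08, 0x00, 0x00, 0x05};
        System.out.println(describe(new ByteArrayInputStream(biff5)));
    }
}
```

In the question's code, the stream passed to describe would be the one returned by poifs.createDocumentInputStream for whichever of "Workbook" or "Book" exists.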

At the time of your question, Apache POI offered no other solution. The good news is that it does now!

In newer versions of Apache POI, if you open one of these old files with HSSFWorkbook or WorkbookFactory, the more helpful OldExcelFormatException is thrown.

If you want to get some information out of these files, OldExcelExtractor can fetch text and numbers from formats including Excel 95 (and older!).

To support that, there are also some Record classes for these formats, so you could do some event-based parsing to handle them in more detail. There is no friendly UserModel support, though.

More generally, from POI's perspective, OpenOffice or LibreOffice may write old MS Office format documents more cleanly than MS Office itself. I used this to work around a case where POI failed to read an Excel 97 .xls file as an HSSFWorkbook.
