Are there any languages that can be integrated with Java and read large XLSX files efficiently?

I'm working on an app that needs to read large XLSX files frequently. I'm using Java, and Apache POI keeps running out of memory on certain XLSX files. I know there's a way to parse the underlying XML with POI, but it looks pretty messy.

Re-saving these files in another format (XLS, CSV) is not an option because the entire process needs to be automated, and some of these files have multiple sheets or exceed the row limit of the XLS format.

I've also thought about writing a script to "recreate" the Excel files with only the underlying data, but this is not ideal because some files have formatting that needs to be preserved.

Are there any languages that I can call from Java that can read large XLSX files without memory issues?

@Gus, I had the same problem. I had to read a 13 MB XLSX file and ran out of heap space with conventional POI. I had to use the XSSF and SAX (event) API to read the file. Although it was very difficult to understand at first, I can now read my XLSX file easily with it (and very quickly, too).

http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api

The Apache developers give an example of its usage at the link. In my case, I copied the code and adapted it to my needs.
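
For anyone wondering why the event approach uses so little memory, here is a minimal sketch of the idea using only the JDK. It is not the code from the POI how-to and does not use POI at all: an XLSX file is a ZIP archive of XML parts, so a sheet can be streamed through a SAX handler one cell at a time instead of being loaded as a whole workbook. The class name `StreamingSheetReader`, the `CellCallback` interface, and the helpers `tinyWorkbook()` and `demo()` are made up for illustration, and the entry name `xl/worksheets/sheet1.xml` is assumed to be the sheet of interest. The sketch reports only raw `<v>` values; for string cells that value is usually an index into `xl/sharedStrings.xml`, which this sketch does not resolve.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.*;
import java.util.zip.*;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams the cells of one sheet without building a workbook model.
final class StreamingSheetReader {

    /** Receives one cell at a time, so memory use does not grow with the row count. */
    interface CellCallback {
        void cell(String ref, String rawValue);
    }

    /** Finds the given ZIP entry (assumed sheet part) and feeds it to a SAX handler. */
    static void readSheet(InputStream xlsx, String entryName, CellCallback callback)
            throws Exception {
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals(entryName)) {
                    continue;
                }
                SAXParserFactory factory = SAXParserFactory.newInstance();
                factory.setNamespaceAware(true);
                factory.newSAXParser().parse(zip, new CellHandler(callback));
                return;
            }
        }
        throw new FileNotFoundException(entryName);
    }

    /** Tracks the current <c r="..."> cell and emits the text of its <v> element. */
    private static final class CellHandler extends DefaultHandler {
        private final CellCallback callback;
        private final StringBuilder text = new StringBuilder();
        private String ref;
        private boolean inValue;

        CellHandler(CellCallback callback) {
            this.callback = callback;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("c".equals(localName)) {
                ref = attrs.getValue("r");
            } else if ("v".equals(localName)) {
                inValue = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inValue) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("v".equals(localName)) {
                inValue = false;
                // Raw value only: shared-string indexes are passed through unresolved.
                callback.cell(ref, text.toString());
            }
        }
    }

    /** Builds a tiny in-memory archive with a single sheet part, for demonstration only. */
    static byte[] tinyWorkbook() throws IOException {
        String sheet = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
            + "<sheetData><row r=\"1\"><c r=\"A1\"><v>42</v></c><c r=\"B1\"><v>3.5</v></c></row>"
            + "</sheetData></worksheet>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            zip.write(sheet.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Collects cells into a list only to make the demo output easy to inspect. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        try {
            readSheet(new ByteArrayInputStream(tinyWorkbook()), "xl/worksheets/sheet1.xml",
                    (ref, value) -> seen.add(ref + "=" + value));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A1=42, B1=3.5]
    }
}
```

With a real file you would pass a `FileInputStream` to `readSheet` and do the per-cell work inside the callback rather than collecting everything into a list. The POI event API from the link does the same kind of streaming but also takes care of shared strings and cell styles, which is why adapting that example is likely the better route when formatting matters.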
