java.lang.OutOfMemoryError: GC overhead limit exceeded when loading an xlsx file
I understand what the error means: my program is consuming too much memory and, for a long period of time, it is not recovering it.
My program is just reading a 6.2 MB xlsx file when the memory issue occurs.
When I monitor the program, it very quickly reaches 1.2 GB of memory consumption and then crashes. How can it reach 1.2 GB when reading a 6.2 MB file?
Is there a way to open the file in chunks, so that it doesn't have to be loaded into memory all at once? Or is there any other solution?
The code below is exactly the part that causes it. But since it is a library, shouldn't it handle this in some smart way? The file has only 200,000 rows with only 3 columns. In the future, I need it to work with approximately 1 million records and more columns.
Code:
Workbook myWorkBook;
Sheet mySheet;
if (filePath.contains(".xlsx")) {
    // Finds the workbook instance for XLSX file
    myWorkBook = new XSSFWorkbook(fis);
    // Return first sheet from the XLSX workbook
    mySheet = myWorkBook.getSheetAt(0);
    myWorkBook.close(); // Should I close myWorkBook before I get data from it?
}
If you wish to work with large XLSX files, you need to use the streaming XSSFReader class. Since the data is XML, you can use StAX to process the contents efficiently.
Here's one way to get the InputStream for a sheet from the xlsx file.
OPCPackage opc = OPCPackage.open(file);
XSSFReader xssfReader = new XSSFReader(opc);
SharedStringsTable sst = xssfReader.getSharedStringsTable();
XSSFReader.SheetIterator itr = (XSSFReader.SheetIterator)xssfReader.getSheetsData();
while (itr.hasNext()) {
    InputStream sheetStream = itr.next();
    if (itr.getSheetName().equals(sheetName)) { // Or you can keep track of sheet numbers
        in = sheetStream;
        return;
    } else {
        sheetStream.close();
    }
}
The elements are <row> and <c> (for cell). You can create a small xlsx file, unzip it, and examine the XML inside for more information.
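As a rough illustration of walking the <row> and <c> elements with StAX, here is a minimal sketch that uses only the JDK's javax.xml.stream API. The class name SheetStaxSketch, the RowHandler callback, and the readRows method are hypothetical names made up for this example. It also assumes the shared strings are available as a plain List<String>, standing in for a lookup against the SharedStringsTable; inline strings, dates, formulas, and empty cells are not handled.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SheetStaxSketch {

    /** Hypothetical callback: receives one row at a time, so rows never pile up in memory. */
    interface RowHandler {
        void onRow(int rowNumber, List<String> cells);
    }

    /**
     * Streams through sheet XML and hands each row to the handler.
     * Assumption: sharedStrings is a plain list standing in for the shared strings table.
     * Cells without a <v> child are skipped, so column positions can shift on sparse rows.
     */
    static void readRows(InputStream sheetStream, List<String> sharedStrings, RowHandler handler)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // sheet XML needs no DTD
        XMLStreamReader reader = factory.createXMLStreamReader(sheetStream);
        try {
            List<String> cells = null;
            int rowNumber = 0;
            String cellType = null;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("row".equals(name)) {
                        // The "r" attribute is optional; fall back to counting rows.
                        String r = reader.getAttributeValue(null, "r");
                        rowNumber = (r != null) ? Integer.parseInt(r) : rowNumber + 1;
                        cells = new ArrayList<>();
                    } else if ("c".equals(name)) {
                        // t="s" marks a cell whose value is an index into the shared strings.
                        cellType = reader.getAttributeValue(null, "t");
                    } else if ("v".equals(name) && cells != null) {
                        String raw = reader.getElementText();
                        cells.add("s".equals(cellType)
                                ? sharedStrings.get(Integer.parseInt(raw))
                                : raw);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "row".equals(reader.getLocalName())) {
                    handler.onRow(rowNumber, cells);
                    cells = null; // drop the row once it has been handled
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Tiny hand-written stand-in for the XML found in an unzipped xlsx sheet.
        String xml = "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
                + "<sheetData>"
                + "<row r=\"1\"><c r=\"A1\" t=\"s\"><v>0</v></c><c r=\"B1\"><v>42</v></c></row>"
                + "<row r=\"2\"><c r=\"A2\" t=\"s\"><v>1</v></c><c r=\"B2\"><v>7.5</v></c></row>"
                + "</sheetData></worksheet>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        readRows(in, Arrays.asList("alpha", "beta"),
                (n, cells) -> System.out.println(n + ": " + cells));
    }
}
```

In a real program, the InputStream obtained from the sheet iterator above would take the place of the in-memory sample, and the string lookup would go through the shared strings table rather than a list. Because each row is handed off and then discarded, memory use stays roughly constant regardless of how many rows the sheet has.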
Edit: There are some examples of processing the data with SAX, but using StAX is a lot nicer and just as efficient.