Read Excel files to R with XLConnect: run out of Java memory

I am reading an Excel sheet into R with XLConnect. It works very well. However, if I re-run the command (after changing values in the Excel file, for example), the function runs out of memory.

The file/sheet I am reading has 18 columns and 363 rows of numeric data.

The error message is

Error: OutOfMemoryError (Java): Java heap space

which appears on the second (identical) run of a readWorksheetFromFile call. I am trying to produce an MWE by repeatedly running the input call from this example, but the error does not seem to be reproducible with that file.

The Excel file I am using has many interconnected sheets and is about 3 MB. The sheet that I am reading is also linked to others, but I have set useCachedValues = TRUE.

It seems to me that, after executing the first call, the Java memory is not cleared. The second call then attempts to fill more data into memory, which causes the call to fail. Is it possible to force a garbage collection on the Java memory? Currently, the only solution is restarting the R session, which is not practical for my clients.

I know that expanding the Java memory might solve this, but that strikes me as a clumsy solution. I would prefer to find a way to dump the memory from previous calls.
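For reference, expanding the Java memory is usually done by setting the JVM heap option before the package is loaded. This is only a sketch: the 1024 MB value is an arbitrary illustration, not a recommendation, and the setting has no effect if the JVM has already been started in the session.

```r
# Set the maximum JVM heap size BEFORE XLConnect (and thus rJava) is loaded.
# "-Xmx1024m" is an illustrative value, chosen arbitrarily for this sketch.
options(java.parameters = "-Xmx1024m")

library(XLConnect)
```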

I have also tried using the more verbose loadWorkbook and readWorksheet functions. The same error occurs.

Let me know if there is any other useful information you may require!

You should have a look at

?xlcFreeMemory

and

?xlcMemoryReport

Both are also mentioned in the XLConnect package documentation for the case where you make multiple runs and want to clean up in between.
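A minimal sketch of how the clean-up could look between runs. The helper name read_sheet_and_free, the temporary file, and the sheet name "data" are made up for illustration; substitute your own workbook and sheet. Whether this is enough for a 3 MB workbook with many linked sheets is something you would have to check with the memory report.

```r
library(XLConnect)

# Illustrative input only: write a small workbook to a temporary file
# so that the example is self-contained.
tmp <- tempfile(fileext = ".xlsx")
writeWorksheetToFile(tmp, data = mtcars, sheet = "data")

# Hypothetical helper (not part of XLConnect): read one sheet, then ask
# R and the JVM to garbage-collect once the read has finished.
read_sheet_and_free <- function(file, sheet) {
  on.exit(xlcFreeMemory(), add = TRUE)
  readWorksheetFromFile(file, sheet = sheet, useCachedValues = TRUE)
}

xlcMemoryReport()                      # free JVM memory before reading
dat <- read_sheet_and_free(tmp, "data")
xlcMemoryReport()                      # free JVM memory after clean-up

# A second, identical run goes through the same clean-up path.
dat2 <- read_sheet_and_free(tmp, "data")
```

Calling xlcFreeMemory() in on.exit means the clean-up also runs when the read itself fails, so a failed attempt does not leave the workbook occupying the heap.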
