R Error: java.lang.OutOfMemoryError: Java heap space

I am trying to connect R to Teradata to pull data directly into R for analysis. However, I get the following error:

Error in .jcall(rp, "I", "fetch", stride, block) :
  java.lang.OutOfMemoryError: Java heap space

I have tried to increase the maximum heap size of the JVM by setting this R option:

options(java.parameters = "-Xmx8g")

I have also tried to initialize the Java parameters with the rJava function .jinit, as .jinit(parameters="-Xmx8g"), but that failed as well.

The calculated size of the data should be approximately 3 GB (actually less than 3 GB).

You need to make sure you're allocating the additional memory before loading rJava or any other packages. Wipe the environment first (via rm(list = ls())), restart R/RStudio if you must, and set the option at the beginning of your script:

options(java.parameters = "-Xmx8000m")

See for example https://support.snowflake.net/s/article/solution-using-r-the-following-error-is-returned-javalangoutofmemoryerror-gc-overhead-limit-exceeded
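The ordering described above (set the option first, load Java-backed packages afterwards) can be sketched as a small guard. This is only an illustration: set_java_heap is a hypothetical helper name, not part of rJava, and checking whether the rJava namespace is loaded is a conservative stand-in for "the JVM may already be running".

```r
# Hypothetical helper (not part of rJava): set the JVM heap option only
# while rJava is still unloaded, i.e. before any JVM can have been started.
set_java_heap <- function(heap = "-Xmx8g") {
  if ("rJava" %in% loadedNamespaces()) {
    warning("rJava is already loaded, so the JVM may already be running; ",
            "restart R and set java.parameters first")
    return(invisible(FALSE))
  }
  options(java.parameters = heap)
  invisible(TRUE)
}

# Call this at the very top of the script, before library(RJDBC),
# library(xlsx), or anything else that depends on rJava.
set_java_heap("-Xmx8g")

# Optional check, assuming rJava and a JVM are installed:
# library(rJava)
# .jinit()
# runtime <- .jcall("java/lang/Runtime", "Ljava/lang/Runtime;", "getRuntime")
# .jcall(runtime, "J", "maxMemory") / 1024^3  # maximum heap size in GiB
```

If the commented check reports a much smaller value than requested, the option was most likely set after the JVM had already started.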

I had this problem intermittently and could not reproduce it reliably. Setting -Xmx8g partly solved it, but I still ran into problems at random.

I have now found an option that uses a different garbage collector:

options(java.parameters = c("-XX:+UseConcMarkSweepGC", "-Xmx8192m"))
library(xlsx)

Put this at the beginning of the script, before any other package is loaded: other packages can load Java components themselves, and the options have to be set before any Java is loaded.

So far, the problem hasn't occurred again.

Only occasionally, in a long session, it can still happen. In that case, restarting the session normally solves the problem.

Running the following two lines of code (before any packages are loaded) worked for me on a Mac:

options(java.parameters = c("-XX:+UseConcMarkSweepGC", "-Xmx8192m"))
gc()

This essentially combines two proposals posted here previously. Importantly, running only the first line (as suggested by drmariod) did not solve the problem in my case. However, when I additionally executed gc() right after the first line (as suggested by user2961057), the problem was solved.

Should it still not work, restart your R session and then try (before any packages are loaded) options(java.parameters = "-Xmx8g") instead, executing gc() directly after that. Alternatively, try to increase the heap further, from "-Xmx8g" to e.g. "-Xmx16g" (provided that you have at least that much RAM).

EDIT: Further solutions: When I had to use rJava for model estimation in R (explaining y from a large number of X), I kept receiving the above 'OutOfMemory' errors even after scaling up to "-Xmx60000m" (the machine I am using has 64 GB RAM). The problem was that some model specifications were simply too big (and would have required even more RAM). One solution that may help in this case is scaling the size of the problem down (e.g. by reducing the number of X's in the model) or, if possible, splitting the problem into independent pieces, estimating each separately, and putting those pieces together again.
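The "split into independent pieces" idea can be sketched roughly as follows. process_in_chunks and its arguments are hypothetical names used for illustration only. What process_chunk does is an assumption that depends on the problem: it might estimate one sub-model, or, when pulling data, run a query restricted to one row range.

```r
# Hypothetical sketch: process row ranges one at a time and combine the
# per-chunk results, so only one piece has to be in memory at once.
process_in_chunks <- function(n_rows, chunk_size, process_chunk) {
  if (n_rows < 1) return(NULL)
  starts <- seq(1, n_rows, by = chunk_size)
  pieces <- lapply(starts, function(start) {
    end <- min(start + chunk_size - 1, n_rows)
    process_chunk(start, end)   # must return a data frame
  })
  do.call(rbind, pieces)        # put the pieces together again
}

# Toy usage: each "chunk result" just records its own row range.
summary_by_chunk <- process_in_chunks(
  n_rows = 10,
  chunk_size = 4,
  process_chunk = function(start, end) {
    data.frame(start = start, end = end, n = end - start + 1)
  }
)
print(summary_by_chunk)
```

This only helps when the pieces really are independent. If one piece needs another's results, peak memory use may not drop.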

I added garbage collection and that solved the issue for me. I am connecting to Oracle databases using RJDBC.
Simply add gc().

