
GC and memory limit issues with R

I am using R on some relatively big data and am hitting some memory issues. This is on Linux. I have significantly less data than the available memory on the system, so it's an issue of managing transient allocation.

When I run gc(), I get the following listing:

           used   (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells   2147186  114.7    3215540  171.8   2945794  157.4
Vcells 251427223 1918.3  592488509 4520.4 592482377 4520.3

yet R appears to have 4gb allocated in resident memory and 2gb in swap. I'm assuming this is OS-allocated memory that R's memory management system will allocate and GC as needed. However, let's say that I don't want to let R OS-allocate more than 4gb, to prevent swap thrashing. I could always ulimit, but then it would just crash instead of working within the reduced space and GCing more often. Is there a way to specify an arbitrary maximum for the gc trigger and make sure that R never OS-allocates more? Or is there something else I could do to manage memory usage?
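
For concreteness, the ulimit route mentioned above would look something like this (a sketch; ulimit -v takes a size in KB, and the 4 GB matches the numbers above):

ulimit -v 4194304   # cap this shell's virtual address space at 4 GB (KB units)
R --no-save         # allocations past the cap now fail outright instead of
                    # triggering more frequent GC, hence the question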

In short: no. I found that you simply cannot micromanage memory management and gc().
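
The closest approximation is R's own startup limits, documented in ?Memory, and they behave much like the ulimit idea in the question: a hard cap on the vector heap rather than a tunable gc trigger. A sketch, assuming the 4 GB budget above; whether R degrades gracefully (more GC pressure under the cap) or fails with an allocation error once live data exceed it varies across R versions:

R --max-vsize=4G              # cap the vector heap at startup
R_MAX_VSIZE=4G R --no-save    # on recent R versions, the same limit
                              # via an environment variable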

On the other hand, you could try to keep your data in memory, but 'outside' of R. The bigmemory package makes that fairly easy. Of course, using a 64bit version of R and ample RAM may make the problem go away too.
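
A minimal sketch of the bigmemory approach, assuming the package is installed (the file names and dimensions are illustrative):

library(bigmemory)

# file-backed big.matrix: the data are memory-mapped from disk and live
# outside the R heap, so they do not count toward the gc trigger
x <- filebacked.big.matrix(nrow = 1e6, ncol = 10, type = "double",
                           backingfile = "x.bin",
                           descriptorfile = "x.desc")

x[1, ] <- rnorm(10)   # reads and writes go through the mapping
mean(x[, 1])          # a column extracts as an ordinary R vector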
