
Exceeding memory limit in R (even with 24GB RAM)

I am trying to merge two dataframes: one has 908450 observations of 33 variables, and the other has 908450 observations of 2 variables.

dataframe2 <- merge(dataframe1, dataframe2, by = "id")

I've cleared all other dataframes from working memory, and reset my memory limit (for a brand new desktop with 24 GB of RAM) using the code:

memory.limit(24576)

But I'm still getting the error Cannot allocate vector of size 173 Mb.

Any thoughts on how to get around this problem?

To follow up on my comments, use data.table. I put together a quick example matching your data to illustrate:

library(data.table)

dt1 <- data.table(id = 1:908450, matrix(rnorm(908450 * 32), ncol = 32))
dt2 <- data.table(id = 1:908450, rnorm(908450))

# set keys
setkey(dt1, id)
setkey(dt2, id)

# check dims
dim(dt1)
# [1] 908450     33
dim(dt2)
# [1] 908450      2

# merge together and check system time:
system.time(dt3 <- dt1[dt2])
#    user  system elapsed 
#    0.43    0.03    0.47
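As an aside, recent data.table versions can also do this join without calling setkey() first, using the `on=` argument. A minimal sketch of that variant (with a smaller toy size so it runs instantly; the same pattern applies at 908450 rows):

```r
library(data.table)

n <- 1000  # toy size standing in for 908450
dt1 <- data.table(id = 1:n, matrix(rnorm(n * 32), ncol = 32))
dt2 <- data.table(id = 1:n, v = rnorm(n))

# ad-hoc join: data.table builds the index on the fly, no setkey() needed
dt3 <- dt1[dt2, on = "id"]
dim(dt3)  # n rows; dt1's 33 columns plus dt2's non-key column v = 34
```

The keyed version is worth keeping if you join on id repeatedly, since the sort is done once and reused.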

So it took less than half a second to merge. I took before-and-after screenshots of my memory usage. Before the merge I was using 3.4 GB of RAM; when I merged, it jumped to 3.7 GB and leveled off. I think you'll be hard pressed to find anything more memory- or time-efficient than that.

Before: [screenshot of memory usage before the merge]

After: [screenshot of memory usage after the merge]

As far as I can think of, there are three solutions:

  • Use data.table
  • Use swap memory (adjustable on *nix machines)
  • Use sampling
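If even an indexed join won't fit in memory, the sampling route can be sketched as follows, in base R with hypothetical column names (x, y): draw a random subset of ids, filter both frames to that subset, then merge the much smaller pieces.

```r
set.seed(42)
n <- 10000
df1 <- data.frame(id = 1:n, x = rnorm(n))
df2 <- data.frame(id = 1:n, y = rnorm(n))

# work on a 10% random sample of ids instead of the full join
ids <- sample(df1$id, size = n %/% 10)
merged_sample <- merge(df1[df1$id %in% ids, ],
                       df2[df2$id %in% ids, ],
                       by = "id")
dim(merged_sample)  # 10% of the rows, columns id + x + y
```

This only makes sense when a representative subset is acceptable for the downstream analysis; it does not produce the full merged table.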
