
Loading multiple files into R > 2GB

I have been trying to load many files into R using several different methods that have worked for me in the past, but for some reason do not work here. I have read many forum posts that address the different ways this can be done, but none of them seem to solve my problem; I suppose the files are larger.

Here are the different things I have tried:

files <- list.files(pattern = "\\.txt$")  # escape the dot so it matches a literal ".txt" suffix

listOfFiles <- list()

for (i in seq_along(files)) {
  listOfFiles[[i]] <- read.table(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE)
}

However, when I run this, my computer just freezes and stops responding. This led me to believe it may be a memory issue; however, I have tried raising memory.limit() to about 12000 and it still does not run.

There is a posting here that partly addresses the issue at hand: Quickly reading very large tables as dataframes. The reason mine differs is that I know the scripts I have written work, just not on many files totaling more than 2 GB. I believe this is a memory issue because, when I ran it again, I got the error:

Error: cannot allocate vector of size 7.8 Mb 

I have read other posts on the forum that use lapply, so I thought I'd try it out; however, it has also failed to work.

Here is what I did:

listo <- lapply(files, read.table)

This, on the other hand, runs, but when I try to open the list listo it gives me the error:

Error: object 'listo' not found

Any help would be much appreciated.

Thank you @TinglTanglBob for your help in solving this question.

Here is how I solved it:

library(data.table)  # provides fread

memory.limit(size = 12000)
files <- list.files(pattern = "\\.txt$")
YFV_list <- list()

for (i in seq_along(files)) {
  YFV_list[[i]] <- fread(files[i], sep = "\t", header = TRUE, stringsAsFactors = FALSE)
}

So I'm assuming it was a memory issue. Using fread from the data.table package helped overcome the problem, where read.table had failed earlier. However, some tweaking of memory.limit was still needed for this to work regardless.
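As a side note, if the per-file tables all share the same columns, the list produced by fread can be stacked into a single data.table with data.table::rbindlist, which avoids the copying overhead of repeatedly calling rbind. A minimal sketch (the two sample files and their column names are made up for illustration, not taken from the original data):

```r
library(data.table)

# Create two small tab-separated sample files in a scratch directory
# to stand in for the real data.
dir <- file.path(tempdir(), "demo")
dir.create(dir, showWarnings = FALSE)
writeLines("id\tvalue\n1\tfoo", file.path(dir, "a.txt"))
writeLines("id\tvalue\n2\tbar", file.path(dir, "b.txt"))

# Read every .txt file with fread, then stack the resulting list of
# data.tables into one table in a single pass.
files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
tables <- lapply(files, fread, sep = "\t", header = TRUE)
combined <- rbindlist(tables, use.names = TRUE, fill = TRUE)
print(combined)
```

use.names = TRUE matches columns by name rather than position, and fill = TRUE pads with NA if some files are missing a column.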


 