Loading multiple files into R > 2GB
I have been trying to load many files into R using several different methods that have worked for me in the past, but for some reason they are not working here. I have read many posts on the forum that address the different ways this can be done, but none of them seem to solve my problem; the files are larger, I suppose.
Here are the different things I have tried:
files <- list.files(pattern = ".txt")
listOfFiles <- list()
for (i in seq_along(files)) {
  listOfFiles[[i]] <- read.table(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE)
}
However, when I run this, my computer just freezes and ceases to work. This has led me to believe that it may be a memory issue; however, I have tried changing memory.limit() to about 12000 and it still does not run.
There is a posting here that sort of addresses the issue at hand: Quickly reading very large tables as dataframes. The reason mine differs is that I know the scripts I have uploaded work, just not on many files totaling more than 2GB. I believe this is a memory issue because, when I ran it again, I got the error:
Error: cannot allocate vector of size 7.8 Mb
I have read other posts on the forum that use lapply, so I thought I'd try it out; however, it has also failed to work. Here is what I did:
listo <- lapply(files, read.table)
This, on the other hand, runs, but when I try to open the list listo it gives me the error:
Error: object 'listo' not found
Any help would be much appreciated.
Thank you @TinglTanglBob for your help in solving this question.
Here is how I solved it:
library(data.table)

memory.limit(size = 12000)
files <- list.files(pattern = ".txt")
YFV_list <- list()
for (i in seq_along(files)) {
  YFV_list[[i]] <- fread(files[i], sep = "\t", header = TRUE, stringsAsFactors = FALSE)
}
So I'm assuming it was a memory issue. Using fread from the data.table package helped overcome the problem, where read.table was not working earlier. However, some tweaking of memory.limit was still needed for this to work.