
R - iterate over list of files

I have a list of files with some data that I want to read into R, and then iterate over each file for some calculations. So far I have been able to read the files with the following code:

METHOD 1

filenames<-list.files(pattern="*.txt")
mynames<-gsub(".txt$", "", filenames)
for (i in 1:length(mynames)) assign(mynames[i], read.table(filenames[i]))

However, when I try to apply a function to the names, it just returns NULL:

lapply(mynames,nrow)

I know that it could be easier to read the files directly into a list:

METHOD 2

temp<-list.files(pattern="*.txt")
myfiles<-lapply(temp, read.table,skip="#")

and then apply lapply to that list, e.g. lapply(myfiles, nrow), but this loses the information about which file produced each list element.

Is there any way to get around this with either method, so that I can keep track of which file each list element came from?

Thanks

For method 2 you could easily use something like

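# pattern is a regular expression, so match the extension explicitly
temp <- list.files(pattern = "\\.txt$")
# skip expects a number of lines; comment.char = "#" (the default) is assumed to be the intent
myfiles <- lapply(temp, read.table, comment.char = "#")
names(myfiles) <- temp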

In this way the names attribute stores the filenames and you do not clutter your working environment with new variables.
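
A minimal self-contained sketch of the same idea, using two throwaway files written to a temporary directory. The file names a.txt and b.txt and the folder name are made up for illustration. It uses the regular expression "\\.txt$" because list.files() interprets pattern as a regex rather than a glob:

```r
# Hypothetical demo data: two small tables in a scratch directory
demo_dir <- file.path(tempdir(), "iterate_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(x = 1:3), file.path(demo_dir, "a.txt"))
write.table(data.frame(x = 1:5), file.path(demo_dir, "b.txt"))

# list.files() takes a regular expression, so anchor the extension explicitly
paths <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read every file and label each list element with its file name
myfiles <- lapply(paths, read.table)
names(myfiles) <- basename(paths)

# sapply() carries the names through to the result
row_counts <- sapply(myfiles, nrow)
print(row_counts)
```

Because the result is named, each row count can be traced back to its file without any extra bookkeeping.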

So when you want to iterate over the content you can use lapply(myfiles, function(.) nrow(.)), or, if you need to iterate over both the filename and the content, you could do something like lapply(names(myfiles), function(.) nrow(myfiles[[.]]))
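
When the function needs both the file name and the content, base R's Map() can walk the two in parallel. A sketch with a small in-memory stand-in for myfiles; the data frames and the describe helper are invented for illustration:

```r
# Stand-in for the named list produced by reading the files (made-up data)
myfiles <- list(
  "a.txt" = data.frame(x = 1:3),
  "b.txt" = data.frame(x = 1:5)
)

# Hypothetical helper that needs both the file name and its table
describe <- function(fname, tbl) {
  paste0(fname, ": ", nrow(tbl), " rows")
}

# Map() pairs each name with its element; the names are kept on the result
summaries <- Map(describe, names(myfiles), myfiles)
print(unlist(summaries))
```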

For the first method, try

  sapply(mynames,function(nameoffile){nrow(get(nameoffile))})
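
The reason lapply(mynames, nrow) returns NULL is that mynames holds character strings, not the data frames themselves, and nrow() of a string is NULL; get() looks up the object that has that name. A sketch using mget(), which fetches several objects at once into a named list. The objects a and b are invented stand-ins for the tables that assign() would create:

```r
# Stand-ins for the variables that assign() would create (made-up data)
a <- data.frame(x = 1:3)
b <- data.frame(x = 1:5)
mynames <- c("a", "b")

# A character string has no dimensions, so nrow() gives NULL
print(is.null(nrow(mynames[1])))

# mget() returns a named list of the objects, ready for lapply()/sapply()
tables <- mget(mynames)
row_counts <- sapply(tables, nrow)
print(row_counts)
```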
