
R loop: perform a function on multiple CSV files

I am trying to write a for loop that performs the same operation on each of several csv files. The example below uses two files, but the real case has more.

dat1<- read.csv("female.csv", header =T)
dat2<- read.csv("male.csv", header =T)

for (i in 1:2) {
  message("Female, Male")
  Temp <- dat[i][(dat[i]$NAME == "Temp"), ]
  Temp <- Temp[complete.cases(Temp)]
  print(mean(Temp$MEAN))
}

However, I get an error:

Error in Temp$MEAN : $ operator is invalid for atomic vectors

I'm not sure why this isn't working. Any help with looping through csv files would be appreciated!

Personally, I think the easiest way to do this is with the plyr package:

library(plyr)
myFiles <- c("male.csv", "female.csv")
dat <- ldply(myFiles, read.csv)
dat <- dat[complete.cases(dat), ]
mean(dat$MEAN)

The way this works is that you first create a vector of file names. ldply() then calls read.csv() on each file name and combines the results into a single data.frame. After that you apply complete.cases() and mean() in the usual way. Note that this gives one overall mean across all files, not one mean per file.
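For readers who would rather not load plyr, the same read-and-stack step can be sketched in base R with lapply() and do.call(rbind, ...). This is only an illustration under assumptions: the sample values and the temporary directory are made up so the snippet runs on its own, and it assumes every file has the same columns (rbind() will fail otherwise).

```r
# Hypothetical sample data, written to a temporary directory so the
# sketch is self-contained. Replace with your real files.
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
write.csv(data.frame(NAME = c("Temp", "Temp"), MEAN = c(10, 20)),
          file.path(tmp_dir, "male.csv"), row.names = FALSE)
write.csv(data.frame(NAME = c("Temp", "Temp"), MEAN = c(30, NA)),
          file.path(tmp_dir, "female.csv"), row.names = FALSE)

myFiles <- file.path(tmp_dir, c("male.csv", "female.csv"))

# Read each file into a list of data frames, then stack them row-wise.
dat_list <- lapply(myFiles, read.csv)
dat <- do.call(rbind, dat_list)

# Drop rows that contain any NA, then take the mean over all files.
dat <- dat[complete.cases(dat), ]
overall_mean <- mean(dat$MEAN)
print(overall_mean)
```

The comma in dat[complete.cases(dat), ] matters: it selects rows. Without it, the logical vector is used to select columns instead.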

Edit:

But if you want the mean of each file separately, here is one way of doing it:

# create a vector of files
myFiles <- c("male.csv", "female.csv")  

# create a function that properly handles ONLY ONE ELEMENT
readAndCalc <- function(x){            # pass in the filename
   tmp <- read.csv(x)                  # read the single file
   tmp <- tmp[complete.cases(tmp), ]   # complete.cases()
   mean(tmp$MEAN)                      # mean
}

x <- "male.csv"
readAndCalc(x)                         # test with ONE file

sapply(myFiles, readAndCalc)           # run with all your files

The way this works is that you first create a vector of filenames, just like before. Then you write a function that processes ONLY ONE file at a time, and test it on a single file to check that it works. Finally, you apply it to all your files with sapply(), which returns one mean per file. Hope that helps.
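Since the question mentions having more files than shown, here is a hedged sketch of how the same pattern might scale: list.files() collects the csv files from a folder, and vapply() is used in place of sapply() so that the result is guaranteed to be one number per file. The folder, the sample values, and the MEAN column layout are assumptions made up for the illustration.

```r
# Hypothetical sample data in a temporary folder, so the sketch runs as-is.
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
write.csv(data.frame(NAME = c("Temp", "Temp"), MEAN = c(10, 20)),
          file.path(tmp_dir, "male.csv"), row.names = FALSE)
write.csv(data.frame(NAME = c("Temp", "Temp"), MEAN = c(30, NA)),
          file.path(tmp_dir, "female.csv"), row.names = FALSE)

# Same single-file function as in the answer.
readAndCalc <- function(x) {
  tmp <- read.csv(x)                 # read one file
  tmp <- tmp[complete.cases(tmp), ]  # keep only rows without NA
  mean(tmp$MEAN)                     # mean of the MEAN column
}

# Collect every .csv file in the folder instead of typing the names by hand.
myFiles <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# vapply() checks that each call returns exactly one numeric value.
means <- vapply(myFiles, readAndCalc, numeric(1))
names(means) <- basename(myFiles)    # label results by file name
print(means)
```

Using vapply() is a design choice rather than a requirement: sapply() works too, but it silently changes its return type if one file yields something unexpected, whereas vapply() stops with an error.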
