Applying lapply in R to one column of a csv file
I have a folder with several hundred csv files. I want to use lapply to calculate the mean of one column within each csv file and save that value into a new csv file that would have two columns: Column 1 would be the name of the original file. Column 2 would be the mean value for the chosen field from the original file. Here's what I have so far:
setwd("C:/~~~~")
list.files()
filenames <- list.files()
read_csv <- lapply(filenames, read.csv, header = TRUE)
dataset <- lapply(filenames[1], mean)
write.csv(dataset, file = "Expected_Value.csv")
Which gives the error message:
Warning message: In mean.default("2pt.csv"[[1L]], ...) : argument is not numeric or logical: returning NA
So I think I have at least two problems that I cannot figure out.
First, why doesn't R recognize that column 1 is numeric? I double and triple checked the csv files and I'm sure this column is numeric.
Second, how do I get the output file to return two columns the way I described above? I haven't gotten far with the second part yet.
I wanted to get the first part to work first. Any help is appreciated.
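On the first problem: the warning text itself points at the cause, since it shows mean() receiving the string "2pt.csv". The call lapply(filenames[1], mean) hands the file name to mean(), not the data that was read from the file. A minimal sketch that reproduces this in isolation, assuming a made-up file name and a made-up data frame (neither is the asker's real data):

```r
# Hypothetical file name, standing in for filenames[1]
fname <- "2pt.csv"

# mean() of a character string returns NA and raises the warning from the question
# (the warning is suppressed here only to keep the demo quiet)
res_name <- suppressWarnings(mean(fname))

# mean() of an actual numeric column behaves as expected
toy <- data.frame(points = c(2, 0, 3, 2))  # made-up data
res_col <- mean(toy[[1]])
```

So the column can be perfectly numeric in every file and the warning still appears, because the column is never reached.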
I didn't use lapply but have done something similar. Hope this helps!
## folder from which all csv files are read
directory <- "C:/mydir/"
## names of all csv files in that folder
x <- list.files(directory, pattern = "\\.csv$")
xpath <- paste0(directory, x)
## empty result; one row is appended per file
df <- NULL
## loop over every file and save the mean and the file name
for (i in seq_along(xpath))
{
  file <- read.csv(xpath[i], header = TRUE, sep = ",")
  first_col <- file[, 1]
  ## one-row data frame holding the mean and the source file name
  d <- data.frame(mean = mean(first_col), filename = x[i])
  df <- rbind(df, d)
}
## write all output to csv
write.csv(df, file = "C:/mydir/final.csv", row.names = FALSE)
The resulting CSV file looks like this:
mean filename
1999.000661 hist_03082015.csv
1999.035121 hist_03092015.csv
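For comparison, the same two-column result could be sketched with lapply, as the question originally asked. This is only an illustration under stated assumptions: the temporary folder, the file names a.csv and b.csv, and the points column are all made up so that the snippet runs on its own, and vapply is swapped in for the final step so the means come back as a plain numeric vector.

```r
# Toy setup: two small csv files in a temporary folder (made-up data)
dir_path <- file.path(tempdir(), "csv_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(points = c(2, 0, 2)), file.path(dir_path, "a.csv"), row.names = FALSE)
write.csv(data.frame(points = c(3, 3, 0)), file.path(dir_path, "b.csv"), row.names = FALSE)

# Full paths of every csv file in the folder
paths <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)

# Read each file into a list of data frames
tables <- lapply(paths, read.csv, header = TRUE)

# Mean of the first column of each data frame, one number per file
means <- vapply(tables, function(tbl) mean(tbl[[1]]), numeric(1))

# Two columns: original file name, mean of the chosen column
result <- data.frame(filename = basename(paths), mean = means)

# Written outside the input folder so a rerun does not read it back in
write.csv(result, file.path(tempdir(), "Expected_Value.csv"), row.names = FALSE)
```

The key difference from the code in the question is that mean() is applied to the data frames returned by read.csv, not to the file names.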
Thanks for the two answers. After much review, it turns out that there was a much easier way to accomplish my goal. The csv files that I had were originally in one file. I split them into multiple files by location. At the time, I thought this was necessary to calculate mean on each type. Clearly, that was a mistake. I went to the original file and used aggregate. Code:
setwd("C:/~~")
allshots <- read.csv("All_Shots.csv", header=TRUE)
EV <- aggregate(allshots$points, list(Location = allshots$Loc), mean)
write.csv(EV, file= "EV_location.csv")
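To see what aggregate returns here, a small sketch on made-up data may help. Only the column names points and Loc come from the code above; the values are invented, and the formula interface is used as an alternative spelling of the same grouping so that the result keeps the original column names.

```r
# Made-up stand-in for All_Shots.csv; only the column names come from the post
allshots <- data.frame(
  Loc    = c("2pt", "2pt", "3pt", "3pt"),
  points = c(2, 0, 3, 0)
)

# Mean of points within each location, using the formula interface
EV <- aggregate(points ~ Loc, data = allshots, FUN = mean)
```

EV then has one row per location, with the group label in Loc and the group mean in points.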
This was a simple solution. Thanks again for the answers. I'll need to get better at lapply for future projects, so they were not a waste of time.