Applying R to one column of a csv file

Question

I have a folder with several hundred csv files. I want to use lapply to calculate the mean of one column within each csv file and save that value into a new csv file that would have two columns: Column 1 would be the name of the original file. Column 2 would be the mean value for the chosen field from the original file. Here's what I have so far:

setwd("C:/~~~~")
list.files()
filenames <- list.files()
read_csv <- lapply(filenames, read.csv, header = TRUE)
dataset <- lapply(filenames[1], mean)
write.csv(dataset, file = "Expected_Value.csv")

Which gives the warning message:

Warning message: In mean.default("2pt.csv"[[1L]], ...) : argument is not numeric or logical: returning NA

So I think I have (at least) two problems that I cannot figure out.

First, why doesn't R recognize that column 1 is numeric? I double- and triple-checked the csv files and I'm sure this column is numeric.

Second, how do I get the output file to return two columns the way I described above? I haven't gotten far with the second part yet.

I wanted to get the first part to work first. Any help is appreciated.
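
A note on the warning: it can be reproduced without any files, because `lapply(filenames[1], mean)` hands the file name itself (a character string) to `mean`, not the data read from that file. The sketch below is only an illustration; `filenames` and the `shots` data frame are made-up stand-ins for the real folder contents.

```r
# Hypothetical stand-ins for the real file list and one file's contents.
filenames <- c("2pt.csv", "3pt.csv")
shots <- data.frame(points = c(2, 0, 2, 2), made = c(1, 0, 1, 1))

# mean() of a character string: returns NA (with the warning from the question,
# suppressed here so the script runs quietly).
from_name <- suppressWarnings(mean(filenames[1]))

# mean() of a numeric column taken from the data frame: an ordinary number.
from_column <- mean(shots[, 1])

print(from_name)    # NA
print(from_column)  # 1.5
```

So the column in the csv files can be perfectly numeric and the warning still appears, since the files' contents never reach `mean`.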

Answer 1

I didn't use lapply but have done something similar. Hope this helps!

    ## directory from which all files are to be read
    directory <- "C:/mydir/"

    ## names of all csv files in the directory
    x <- list.files(directory, pattern = "\\.csv$")
    xpath <- paste(directory, x, sep = "")

    ## empty result; one row is appended per file
    df <- NULL

    ## for loop to read each file and save the metric and the file name
    ## (use e.g. 1:2 instead of seq_along(x) to process only the first files)
    for (i in seq_along(x))
    {
        file <- read.csv(xpath[i], header = TRUE, sep = ",")
        first_col <- file[, 1]
        d <- data.frame(mean = mean(first_col), filename = x[i])
        df <- rbind(df, d)
    }

    ## write all output to csv
    ## (final.csv lands in the same directory, so a rerun would read it too)
    write.csv(df, file = "C:/mydir/final.csv")

The CSV file looks like below:

    mean         filename
    1999.000661  hist_03082015.csv
    1999.035121  hist_03092015.csv
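
Since the question asked about lapply specifically, here is a minimal sketch of the same idea without an explicit loop. It is an illustration under stated assumptions, not the answerer's code: the temporary directory, the two small files, and the helper `first_col_mean` are all invented so the example can run anywhere. The key point is that the function passed to `lapply` reads the file and then takes the mean of a column, instead of receiving only the file name.

```r
# Build a throwaway directory with two small csv files (made-up data).
directory <- file.path(tempdir(), "csv_demo")
dir.create(directory, showWarnings = FALSE)
write.csv(data.frame(points = c(2, 0, 2)),
          file.path(directory, "a.csv"), row.names = FALSE)
write.csv(data.frame(points = c(3, 3, 0, 0)),
          file.path(directory, "b.csv"), row.names = FALSE)

# Full paths are used for reading; bare names go into the output column.
paths <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file and return the mean of its first column.
first_col_mean <- function(path) {
  mean(read.csv(path, header = TRUE)[, 1])
}

# lapply returns a list with one mean per file; unlist flattens it to a vector.
result <- data.frame(
  filename = basename(paths),
  mean = unlist(lapply(paths, first_col_mean))
)

# Written outside `directory` so a rerun does not read the summary back in.
write.csv(result, file.path(tempdir(), "Expected_Value.csv"), row.names = FALSE)
print(result)
```

To use it on real data, point `directory` at the folder holding the csv files and drop the lines that create the demo files.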

Answer 2

Thanks for the two answers. After much review, it turns out that there was a much easier way to accomplish my goal. The csv files that I had were originally in one file. I split them into multiple files by location. At the time, I thought this was necessary to calculate the mean for each type. Clearly, that was a mistake. I went to the original file and used aggregate. Code:

setwd("C:/~~")
allshots <- read.csv("All_Shots.csv", header=TRUE)
EV <- aggregate(allshots$points, list(Location = allshots$Loc), mean)
write.csv(EV, file= "EV_location.csv")

This was a simple solution. Thanks again for the answers. I'll need to get better at lapply for future projects, so they were not a waste of time.
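
For readers without the original file, the grouping step can be tried on made-up data. In this sketch the `allshots` data frame is invented; only the column names `Loc` and `points` are taken from the code above. It uses the formula interface of `aggregate`, which names the result columns after the inputs, whereas the vector form used above labels the value column `x`.

```r
# Made-up stand-in for All_Shots.csv; column names follow the code above.
allshots <- data.frame(
  Loc = c("2pt", "2pt", "3pt", "3pt", "3pt"),
  points = c(2, 0, 3, 0, 3)
)

# Formula interface: mean of points within each value of Loc.
EV <- aggregate(points ~ Loc, data = allshots, FUN = mean)
print(EV)
```

The result has one row per location, which is the two-column layout the question was after, with locations in place of file names.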

Answer 1
1 2015-03-28 06:53:01

Answer 2
0 2015-03-30 00:36:38