Creating a function which extracts a user-specified column from a set of files

Question

I have a set of CSV files, all with the same structure. I want to create a function that extracts a particular column from every file, computes the mean of the values in that column, and stores the means in a vector. The column name should be passed in by the user.

I have written the following function. Somehow it cannot identify "pollutant", which contains the name of the column.

pollutantmean<-function(pollutant)
{
  file_names<-dir("C:/Users/Keval/Desktop/Project R/R_courseera_programming_exercise/specdata",pattern= glob2rx("*.csv"))

  for(file_name in file_names)
  {
    file_reader<-read.csv(file_name)
    pollutant_data<-file_reader$pollutant
  }
  pollutant_data
  pollutant
}

Answer 1

Use a string, e.g., call your function with

pollutantmean(pollutant = "mercury")

and use [ (which accepts a string held in a variable) instead of $, which treats whatever follows it as a literal column name:

# replace the line
pollutant_data <- file_reader$pollutant
# with this:
pollutant_data <- file_reader[, pollutant]
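
To see the difference in isolation, here is a minimal sketch on a made-up data frame. The column names mercury and lead are placeholders for illustration, not columns known to be in the actual files. With $, R looks for a column literally called "pollutant"; with [ or [[, it first evaluates the variable and then looks up the column it names.

```r
# Toy data frame; the column names are illustrative assumptions.
df <- data.frame(mercury = c(1, 2, 3), lead = c(10, 20, 30))
pollutant <- "mercury"

by_dollar  <- df$pollutant      # searches for a column named "pollutant"
by_bracket <- df[, pollutant]   # evaluates the variable, then selects "mercury"
by_double  <- df[[pollutant]]   # same lookup, always returns a single column

print(is.null(by_dollar))       # no such column, so $ silently returns NULL
print(by_bracket)
print(by_double)
```

Because $ returns NULL rather than raising an error here, the original function fails quietly, which is why the problem can be hard to spot.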

This won't error out, but you still need to take the mean for each file and store it. I'd also use list.files rather than dir (dir is an alias for it, so this is only a readability preference).
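
One thing worth checking when listing the files: by default list.files returns bare file names relative to the directory you pass in, so read.csv will only find them if that directory is also the working directory. The sketch below, which uses a throwaway temporary directory and hypothetical file names, shows the difference that full.names = TRUE makes.

```r
# Create a throwaway directory with two small CSV files (hypothetical names).
demo_dir <- file.path(tempdir(), "listfiles_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1), file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(x = 2), file.path(demo_dir, "002.csv"), row.names = FALSE)

# Default: file names only, relative to demo_dir.
bare_names <- list.files(demo_dir, pattern = glob2rx("*.csv"))

# full.names = TRUE: the directory is prepended, giving paths usable from anywhere.
full_paths <- list.files(demo_dir, pattern = glob2rx("*.csv"), full.names = TRUE)

print(bare_names)
print(file.exists(full_paths))

# dir() is an alias for list.files(), so both calls return the same listing.
print(identical(dir(demo_dir), list.files(demo_dir)))
```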

pollutantmean <- function(pollutant) {
  # full.names = TRUE returns complete paths, so read.csv() can find the
  # files even when the working directory is somewhere else
  file_names <- list.files("C:/Users/Keval/Desktop/Project R/R_courseera_programming_exercise/specdata",
                           pattern = glob2rx("*.csv"), full.names = TRUE)

  # initialize mean vector at correct length
  my_means <- numeric(length(file_names))
  # make the loop indexed by number
  for (i in seq_along(file_names)) {
    file_reader <- read.csv(file_names[i])
    pollutant_data <- file_reader[, pollutant]
    # using the number index
    # (pass na.rm = TRUE to mean() if the column has missing values)
    my_means[i] <- mean(pollutant_data)
  }
  return(my_means)
}
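
If you want to try the idea without the real data, the following self-contained sketch may help. It is a variant, not the function above: the name pollutantmean_sketch, the extra directory argument, the use of vapply in place of the for loop, and na.rm = TRUE are all assumptions made for illustration, and the CSV contents are invented.

```r
# Variant that takes the directory as an argument (an assumption made so the
# sketch can run against temporary files instead of a hard-coded path).
pollutantmean_sketch <- function(directory, pollutant) {
  file_paths <- list.files(directory, pattern = glob2rx("*.csv"),
                           full.names = TRUE)
  # vapply() replaces the explicit loop: one numeric mean per file.
  vapply(file_paths, function(path) {
    # na.rm = TRUE is an assumption: it skips missing values in the column.
    mean(read.csv(path)[[pollutant]], na.rm = TRUE)
  }, numeric(1), USE.NAMES = FALSE)
}

# Invented data: two small files in a temporary directory.
demo_dir <- file.path(tempdir(), "pollutant_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(mercury = c(1, 2, 3), lead = c(4, NA, 6)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(mercury = c(10, 20), lead = c(5, 7)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

mercury_means <- pollutantmean_sketch(demo_dir, "mercury")
lead_means <- pollutantmean_sketch(demo_dir, "lead")
print(mercury_means)
print(lead_means)
```

Taking the directory as an argument also makes the function easier to reuse on another machine, since the path no longer lives inside the function body.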

solution1 1 ACCEPTED 2015-03-20 20:42:40
