Create data frames from selected CSV files in R

Question

I have a series of csv files like this:

dataframe_1 <- read.csv('C:filepath/data_1.csv', header = T, skip = 1)

Each file holds one year of records. The number of files varies from run to run: sometimes only a few, sometimes dozens. So far I have been creating individual data frames and extracting the rows and columns I want with:

cutout_1 <- dataframe_1[c(1:365), c(2, 4, 6, 8, 10)]

and then binding them with rbind() as follows:

total <- rbind(cutout_1, cutout_2, cutout_3, cutout_4)
as.data.frame(total)

However, this is clunky, and I need to rewrite the code every time I change something about the model I am using, such as the number of years (and thus the number of files it produces), which wastes a lot of time.

I have tried indexing through the data files, but I can't find a way to read only the files I want, or to skip the first row of each file. Skipping it is essential because of how the data is produced, which I have no control over.

Answer 1

Assuming the working directory is the one that contains the files, the code below first gets the file names, then reads each file in an lapply loop, keeps the wanted rows and columns, and adds a cutout column holding the file's base name without directory path or extension. Finally, it combines the results into a single data.frame with rbind.

filenames <- list.files(pattern = "Day_Climate_.*\\.csv")

cols_to_keep <- c(2, 4, 6, 8, 10)
rows_to_keep <- 1:365

cutout_list <- lapply(filenames, \(x) {
  dftmp <- read.csv(x, skip = 1L)
  dftmp <- dftmp[rows_to_keep, cols_to_keep]
  # add a column recording which file
  # each row came from
  # (this might not be needed)
  cutout <- basename(x)
  cutout <- tools::file_path_sans_ext(cutout)
  dftmp$cutout <- cutout
  # the last expression is the value returned
  # by the anonymous function
  dftmp
})

total <- do.call(rbind, cutout_list)
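
To read only some of the matching files, the vector of file names can be filtered before the lapply call. The following is a minimal, self-contained sketch rather than part of the original answer. It assumes file names of the form data_N.csv, as in the question's example, and writes three small hypothetical files to a temporary directory so that it can be run anywhere. The names read_cutout, wanted_ids and the source column are made up for illustration.

```r
# Hypothetical setup: three small CSV files in a temporary directory.
# Each file starts with a junk line that must be skipped, as in the question.
tmp_dir <- file.path(tempdir(), "csv_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (i in 1:3) {
  path <- file.path(tmp_dir, sprintf("data_%d.csv", i))
  writeLines(c("junk line written by the model",
               "day,a,b",
               sprintf("%d,%d,%d", 1:4, i, i * 10L)),
             path)
}

# List every candidate file, then keep only the ones we want.
all_files <- list.files(tmp_dir, pattern = "^data_[0-9]+\\.csv$",
                        full.names = TRUE)
# Pull the numeric suffix out of each file name (assumed naming scheme).
file_ids <- as.integer(sub("^data_([0-9]+)\\.csv$", "\\1",
                           basename(all_files)))
wanted_ids <- c(1, 3)  # hypothetical selection
selected <- all_files[file_ids %in% wanted_ids]

# Read one file, skip its first line, keep the chosen rows and columns,
# and record which file the rows came from.
read_cutout <- function(path, rows = 1:2, cols = c(1, 3)) {
  df <- read.csv(path, skip = 1L)
  df <- df[rows, cols]
  df$source <- tools::file_path_sans_ext(basename(path))
  df
}

total <- do.call(rbind, lapply(selected, read_cutout))
```

With real data, rows and cols would be 1:365 and c(2, 4, 6, 8, 10) as in the question. Note that indexing with df[rows, cols] produces rows of NA when a file has fewer rows than requested, so head(df[, cols], 365) may be a safer choice if file lengths can vary.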
