I have a series of csv files like this:
dataframe_1 <- read.csv('C:filepath/data_1.csv', header = T, skip = 1)
Each file holds one year of records. The number of files varies from run to run: sometimes only a few, other times dozens. What I've been doing is creating individual dataframes and stripping out the rows and columns I want using:
cutout_1 <- dataframe_1[c(1:365), c(2, 4, 6, 8, 10)]
and then binding them with rbind() as follows:
total <- rbind(cutout_1, cutout_2, cutout_3, cutout_4)
as.data.frame(total)
However, this is clunky, and I need to rewrite it every time I change something about the model I am using, such as the number of years (and thus the number of files it produces), which wastes a lot of time.
I have tried indexing through the data file, but I can't seem to find a way to extract only the files I want, or a way to skip the first row. Skipping that row is essential because of the way the data is produced, which I have no control over.
Assuming the working directory is the directory where the files can be found, the code below first gets the filenames, then reads them in an lapply loop, keeping only the wanted rows and columns, and adds a cutout column holding each file's base name without the directory path or extension. It then combines the results into one data.frame with rbind.
filenames <- list.files(pattern = "Day_Climate_.*\\.csv")
cols_to_keep <- c(2, 4, 6, 8, 10)
rows_to_keep <- 1:365
cutout_list <- lapply(filenames, \(x) {
  dftmp <- read.csv(x, skip = 1L)
  dftmp <- dftmp[rows_to_keep, cols_to_keep]
  # these instructions add to each data frame
  # a column recording which file it came from
  # (this might not be needed)
  cutout <- basename(x)
  cutout <- tools::file_path_sans_ext(cutout)
  dftmp$cutout <- cutout
  # the data frame must be the last expression,
  # so that the anonymous function returns it
  dftmp
})
total <- do.call(rbind, cutout_list)
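To try the approach without the real model output, here is a minimal self-contained sketch. It writes three small made-up files to a temporary directory, each with a junk first line, and then applies the same read, subset and rbind steps. The directory demo_dir, the helper read_one, the column names and the row and column selections are all hypothetical and chosen only for the demo. It also passes full.names = TRUE to list.files, so it does not depend on the working directory.

```r
# Hypothetical demo data: three small "yearly" files in a temp directory,
# each starting with a junk line that has to be skipped.
demo_dir <- file.path(tempdir(), "climate_demo")
dir.create(demo_dir, showWarnings = FALSE)

for (yr in 2001:2003) {
  path <- file.path(demo_dir, sprintf("Day_Climate_%d.csv", yr))
  writeLines(
    c("junk line written by the model",          # row to skip
      "day,tmin,tmax,rain",                      # real header (made-up names)
      sprintf("%d,%d,%d,%d", 1:5, 1:5, 11:15, 0:4)),
    path
  )
}

# Hypothetical helper: read one file, skip the first line,
# keep selected rows/columns and tag the rows with the file's base name.
read_one <- function(path, rows = 1:3, cols = c(2, 4)) {
  dftmp <- read.csv(path, skip = 1L)
  dftmp <- dftmp[rows, cols]
  dftmp$cutout <- tools::file_path_sans_ext(basename(path))
  dftmp
}

# full.names = TRUE returns paths that include demo_dir,
# so the working directory does not matter here.
filenames <- list.files(demo_dir, pattern = "^Day_Climate_.*\\.csv$",
                        full.names = TRUE)

total <- do.call(rbind, lapply(filenames, read_one))
str(total)
```

Because the file list is built by list.files, the same code runs unchanged whether the model produces three files or thirty. One caveat: indexing with a fixed row range such as 1:365 yields rows of NA if a file has fewer rows than that, so it may be worth checking nrow() for each file.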