
In R: reading a fixed number of CSV files from a folder and creating a data frame from each group of files

Problem:

I have a folder containing 168 csv files. Each csv has 200 observations (for simplicity, assume it holds only one variable, x). Each file records the observations for one hour, over seven days (i.e. 24 x 7 = 168 files).

What I want:

Read 24 files (a single day) and create one dataframe from them. Then repeat the process for the next 24 files. This way, we end up with 7 dataframes (one for each day), each with 200 x 24 = 4800 observations.

What I tried:

setwd('/data/')
temp = list.files(pattern="*.csv")

for(i in seq(from=1, to=168, by=24)){
  data <- temp[i : i+23] %>% 
     lapply(read.csv, skip=1, header=FALSE) %>% 
     bind_rows   
assign ( paste0("df_",i,sep=""), data)
rm(data)
}

Result:

But I did not get 4800 observations in each df. Instead, each df has only 200 observations (e.g. df_1: 200 obs). What did I do wrong? Can someone please help?
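
A note on the likely cause: in R the `:` operator binds more tightly than `+`, so `i : i+23` is parsed as `(i:i) + 23`, which is the single number `i + 23`. Each pass of the loop therefore reads only one file, which matches the 200 rows seen in every df. A minimal sketch of the difference (the variable names `wrong` and `right` are only for illustration):

```r
i <- 1

# `:` is evaluated before `+`, so this is (1:1) + 23
wrong <- i:i+23
print(wrong)          # a single index: 24

# Parentheses give the intended range of 24 consecutive indices
right <- i:(i+23)
print(length(right))  # 24 indices: 1, 2, ..., 24
```

Writing `temp[i:(i+23)]` in the original loop should then select all 24 files for the day.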

Try this one. I tested it with 14 csv files of 200 observations each, and it worked.

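# pattern is a regular expression, so match the ".csv" extension explicitly
files <- list.files(pattern = "\\.csv$")
# k is the number of segments: 7 consecutive groups of 24 files each
segments <- pls::cvsegments(168, k = 7, type = "consecutive")
newFileList <- lapply(segments, function(f){
    # read the files of one day and stack them into a single table
    data.table::rbindlist(lapply(files[f], function(x){
        read.csv(x, skip = 1, header = FALSE)
    }))
})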
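
For comparison, the same consecutive grouping of file indices can be sketched in base R with `split()`, without the pls package. This only builds the index groups; the names `n_files`, `per_day` and `groups` are illustrative and not from the answer above.

```r
n_files <- 168   # total number of hourly files
per_day <- 24    # files that belong to one day

idx <- seq_len(n_files)

# ceiling(idx / per_day) is 1 for files 1-24, 2 for files 25-48, and so on
groups <- split(idx, ceiling(idx / per_day))

print(length(groups))    # number of days
print(lengths(groups))   # number of files in each day
```

Each element of `groups` can then be used to index the vector of file names, in the same way as `files[f]` above.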

Here's a solution that's simpler than TheRimalaya's with no need for extra packages. You just need to use a nested for loop.

setwd('/data/')
temp = list.files(pattern="*.csv")
for(i in seq(from=1, to=168, by=24)){
    for(j in 0:23){
        # can add other csv options in the read.csv
        hour <- read.csv(temp[i+j],header=FALSE)
        # if first hour of a day, start new dataframe, else combine with previous hour 
        if(j==0) daydf <- hour else daydf <- rbind(daydf, hour) 
    }
    assign(paste0("df_",i), daydf)
    print(paste("Creating dataframe:",paste0("df_",i)))
}

This will create seven data frames: df_1, df_25, df_49 etc. I haven't tested this but it should work.
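
A variation worth considering, as a sketch only: keep the seven data frames in one named list instead of creating separate variables with `assign()`. To make it runnable on its own, the example first writes 168 small csv files into a temporary folder; the folder name `hourly_demo` and the file names `hour_001.csv` ... `hour_168.csv` are assumptions, not the asker's real files.

```r
# Create a throwaway folder with 168 csv files of 200 rows each
# (hypothetical names, used only so the sketch can run by itself)
demo_dir <- file.path(tempdir(), "hourly_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (k in 1:168) {
  write.csv(data.frame(x = rnorm(200)),
            file.path(demo_dir, sprintf("hour_%03d.csv", k)),
            row.names = FALSE)
}

# list.files() returns names in alphabetical order; the zero-padded
# names above make that order match the hour order
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Day number for each file: 1 for the first 24 files, 2 for the next 24, ...
day_id <- ceiling(seq_along(files) / 24)

# Read the files of each day and stack them into one data frame per day
days <- lapply(split(files, day_id), function(day_files) {
  do.call(rbind, lapply(day_files, read.csv, skip = 1, header = FALSE))
})
names(days) <- paste0("day_", seq_along(days))

print(sapply(days, nrow))
```

With real data, note that file names without zero padding (for example `hour_2.csv` and `hour_10.csv`) would not sort in hour order, so the file vector may need to be reordered before grouping. A single day is then available as `days$day_1` or `days[["day_1"]]`.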

