Execute for loop multiple times in R

I have around 631 directories with 20 files in each directory. I want to run the code below for every directory, so that dir_1 becomes dir_2 and so on up to dir_631 on each iteration. I tried a nested for loop but could not get it to work. Thanks in advance.

library(TeachingDemos)

txtStart("command_split_1000/dir_1.txt")
files <- list.files(path="data/split_1000/dir_1", pattern="x*", full.names=TRUE)

total.coefs <- data.frame()

for (x in files) {
  message('Running: ', x)

  output <- tryCatch({
    ulfasQTL::findSqtl(x, geneObjectName = "gene_after_preprocess", snpFileCate = 1)
  }, print(x), error=function(e) {
    cat("ERROR :", conditionMessage(e), "\n")
  })

  total.coefs <- rbind(total.coefs, output)
  write.table(total.coefs, file = 'output_split_1000/dir_1', sep='\t')

}

txtStop()

Consider nesting a list.files loop inside a list.dirs loop. Also, avoid calling rbind inside a loop, as it leads to excessive copying in memory (see Patrick Burns' The R Inferno, Circle 2: Growing Objects). Instead, use lapply to build a list of data frames and call rbind once, outside the loop.
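To illustrate the point about growing objects on its own, here is a minimal sketch with made-up data (the `pieces` list is purely illustrative and stands in for per-file results). It builds the same table both ways:

```r
# Made-up data: five small data frames standing in for per-file results.
pieces <- lapply(1:5, function(i) data.frame(id = i, value = i^2))

# Growing approach: every rbind() call copies all rows accumulated so far.
grown <- data.frame()
for (p in pieces) {
  grown <- rbind(grown, p)
}

# List approach: collect the pieces first, then bind them in a single call.
bound <- do.call(rbind, pieces)

# Both hold the same five rows; only the amount of copying differs.
nrow(grown)
nrow(bound)
```

With five pieces the difference is negligible, but the copying in the growing approach increases with every iteration. The same collect-then-bind pattern is what the per-directory loop that follows applies to each directory's files.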

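library(TeachingDemos)  # PROVIDES txtStart() AND txtStop()

# RETRIEVE ALL NEEDED DIRECTORIES (recursive=FALSE LEAVES OUT THE PARENT DIRECTORY ITSELF)
dirs <- list.dirs(path="data/split_1000", recursive=FALSE)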

for (d in dirs) {
  txtStart(paste0("command_split_1000/", basename(d), ".txt"))

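  # RETRIEVE ALL FILES IN CURRENT DIRECTORY
  # NOTE: pattern IS A REGULAR EXPRESSION, SO "x*" MATCHES EVERY FILE NAME;
  # USE "^x" INSTEAD IF ONLY NAMES STARTING WITH x ARE WANTED
  message('Directory: ', d)
  files <- list.files(path=d, pattern="x*", full.names=TRUE)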

  # LIST OF DATA FRAMES
  df_list <- lapply(files, function(x) {
      message('--Running: ', x)

      # RETURN THE RESULT, OR NULL IF findSqtl RAISES AN ERROR
      tryCatch({
         ulfasQTL::findSqtl(x, geneObjectName = "gene_after_preprocess", snpFileCate = 1)
      }, error=function(e) {
         cat("ERROR :", conditionMessage(e), "\n")
         NULL
      })
  })

  # ROW BIND ALL NON-NULL DF ELEMENTS
  df_list <- Filter(NROW, df_list)
  total.coefs <- do.call(rbind, df_list)

  # SAVE OUTPUT WITH BASE NAME OF CURRENT DIRECTORY
  out_path <- paste0('output_split_1000/', basename(d), '.txt')
  write.table(total.coefs, file = out_path, sep='\t')

  txtStop()

  # RELEASE RESOURCES
  rm(df_list, files, total.coefs)
  gc()
}
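Since ulfasQTL and the original data are not available to most readers, the nested pattern can be tried out with a self-contained dry run. This is only a sketch under stated assumptions: `process_file` is a hypothetical stand-in for `ulfasQTL::findSqtl`, and the directory layout is created in a temporary folder rather than under data/split_1000.

```r
# Hypothetical stand-in for the real analysis: returns a one-row data frame,
# or raises an error when the file is empty.
process_file <- function(path) {
  lines <- readLines(path)
  if (length(lines) == 0) stop("empty file: ", basename(path))
  data.frame(file = basename(path), n_lines = length(lines))
}

# Build a throwaway layout: root/dir_1 .. root/dir_3, two files in each.
root <- file.path(tempdir(), "split_demo")
unlink(root, recursive = TRUE)
for (i in 1:3) {
  d <- file.path(root, paste0("dir_", i))
  dir.create(d, recursive = TRUE)
  writeLines(c("a", "b"), file.path(d, "xaa"))
  writeLines(character(0), file.path(d, "xab"))  # empty file: triggers the error
}

# recursive = FALSE returns only the immediate subdirectories, not root itself.
dirs <- list.dirs(root, recursive = FALSE)

results <- lapply(dirs, function(d) {
  files <- list.files(d, pattern = "^x", full.names = TRUE)

  # One list element per file; NULL where processing failed.
  df_list <- lapply(files, function(f) {
    tryCatch(process_file(f), error = function(e) {
      message("ERROR: ", conditionMessage(e))
      NULL
    })
  })

  # Drop the NULL elements, then bind once per directory.
  do.call(rbind, Filter(NROW, df_list))
})
names(results) <- basename(dirs)
```

Each element of `results` holds the combined data frame for one directory. The empty files are reported through `message` and skipped instead of stopping the run, which mirrors what the `tryCatch` and `Filter(NROW, ...)` steps do in the answer.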


 