
Listing files that match a criterion in R

I have a series of files. Every file name contains two numbers. The first is the generation number, which ranges from 0 to 250. The second is the model number, which ranges from 1 to 450.

Some examples:

Generation_Flux_0_Model_10.txt
Generation_Flux_0_Model_5.txt
Generation_Flux_1_Model_20.txt
Generation_Flux_2_Model_17.txt
Generation_Flux_5_Model_9.txt
Generation_Flux_55_Model_5.txt
Generation_Flux_117_Model_2.txt
Generation_Flux_8_Model_23.txt

I want to list files only for a specified set of generations. For example, getting the files for generation 1 and 8 should list only:

Generation_Flux_1_Model_20.txt and Generation_Flux_8_Model_23.txt.

I wrote the following lines, which only return a logical vector rather than the file names.

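library(stringr)  # provides str_extract()
reactionFile = list.files(pattern = "\\.txt$")
generations = c(0, 1, 8)
# str_extract() returns the first run of digits in each name, i.e. the generation number
str_extract(reactionFile, "\\d+") %in% generations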

[1] TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE

  1. Is there a way to specify this criterion in the pattern argument of list.files()?
  2. Also, which would be faster for selecting only the required files: listing all the files in the directory and taking a subset, or listing only the required files with list.files()?

Try this pattern:

list.files(pattern = "^Generation_Flux_[18]_Model_\\d+\\.txt$")

This should match only generations 1 and 8, with any model number.
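As a quick check that needs no files on disk, the same pattern can be run against the sample names from the question with grep(). This is only a sketch: the `files` vector below is a stand-in for what list.files() would return, and `matched` is a name chosen here for illustration.

```r
# Sample names copied from the question (stand-in for a real directory listing).
files <- c(
  "Generation_Flux_0_Model_10.txt",
  "Generation_Flux_0_Model_5.txt",
  "Generation_Flux_1_Model_20.txt",
  "Generation_Flux_2_Model_17.txt",
  "Generation_Flux_5_Model_9.txt",
  "Generation_Flux_55_Model_5.txt",
  "Generation_Flux_117_Model_2.txt",
  "Generation_Flux_8_Model_23.txt"
)

# [18] matches exactly one character, "1" or "8", so the generation must be
# a single digit; "117" is rejected because "_" must follow that first digit.
pat <- "^Generation_Flux_[18]_Model_\\d+\\.txt$"

# value = TRUE returns the matching names instead of their positions.
matched <- grep(pat, files, value = TRUE)
print(matched)
```

Note that a character class only works for single-digit generations; something like generation 18 would need an alternation instead.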

If you have an arbitrary set of generation numbers, you can build the alternation dynamically, e.g.

# Build a regex alternation such as "(1|50|100|150)" from a vector of generation numbers
getGens <- function(v) {
    pat <- paste0("(", paste0(v, collapse="|"), ")")
    return(pat)
}

gens <- c(1, 50, 100, 150)      # or any values you wish to use
pat <- paste0("^Generation_Flux_", getGens(gens), "_Model_\\d+\\.txt$")
list.files(pattern = pat)
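On the second question, the other route is to list everything and subset afterwards, which is what the logical vector in the question is one step away from doing. A minimal base-R sketch follows, assuming every name follows the Generation_Flux_<gen>_Model_<model>.txt layout; the `files` vector stands in for list.files(pattern = "\\.txt$"), and `gen` and `selected` are names chosen here for illustration.

```r
# Stand-in for list.files(pattern = "\\.txt$"); names copied from the question.
files <- c(
  "Generation_Flux_0_Model_10.txt",
  "Generation_Flux_0_Model_5.txt",
  "Generation_Flux_1_Model_20.txt",
  "Generation_Flux_2_Model_17.txt",
  "Generation_Flux_5_Model_9.txt",
  "Generation_Flux_55_Model_5.txt",
  "Generation_Flux_117_Model_2.txt",
  "Generation_Flux_8_Model_23.txt"
)
generations <- c(0, 1, 8)

# Capture the generation number from each name. Assumes every name fits the
# layout; a name that does not would come back unchanged and turn into NA.
gen <- as.integer(sub("^Generation_Flux_(\\d+)_Model_\\d+\\.txt$", "\\1", files))

# Index the names with the logical vector instead of printing the vector itself.
selected <- files[gen %in% generations]
print(selected)
```

As far as I know, list.files() still has to read every directory entry and test it against the pattern, so the speed difference between the two approaches is probably small for a few thousand files. If it matters, time both on the real directory.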

