
R: get list of files but not of directories

In R, how can I get a list of the files in a folder, but not the directories?

I have tried using dir(), list.files(), and list.dirs() with different options, but none of them seems to work.

setdiff(list.files(), list.dirs(recursive = FALSE, full.names = FALSE))

will do the trick.

Here's one possibility:

all.files <- list.files(recursive = FALSE)
all.files[!file.info(all.files)$isdir]

Another option (pattern for files with extensions, not so universal, of course):

Sys.glob("*.*")
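To make the "not so universal" caveat concrete, here is a minimal sketch. The scratch directory and the file names in it are hypothetical, created only for illustration. It shows that the glob misses a file without an extension and picks up a directory whose name happens to contain a dot.

```r
# Hypothetical scratch directory, used only for this illustration.
d <- file.path(tempdir(), "glob_demo")
dir.create(d, showWarnings = FALSE)
file.create(file.path(d, c("notes.txt", "README")))
dir.create(file.path(d, "archive.v1"), showWarnings = FALSE)

# "*.*" matches any name containing a dot, whether file or directory.
paths <- Sys.glob(file.path(d, "*.*"))
matched <- basename(paths)
print(matched)  # includes "archive.v1" (a directory), misses "README" (a file)

# Dropping directories removes the false positive,
# but the extensionless file is still missed.
only_files <- basename(paths[!dir.exists(paths)])
print(only_files)
```

So the glob is probably only safe when every file is known to have an extension and no directory name contains a dot.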

The fact that base R does not have a direct method to do this is somewhat appalling. The fact that BASH doesn't have a direct way is also a bit odd.

In my opinion, the best R solution is to simply appeal to the shell:

filenames = system('ls -p | grep -v /', intern=T)

Explanation:

ls -p     Append "/" to end of directory names
grep -v   Exclude strings matching "/"
intern=T  Store the output in the variable rather than printing to stdout

Another option:

Filter(function(x) file_test("-f", x), list.files())

And if you want to get fully functional with library functional, then you can save a few keystrokes:

Filter(Curry(file_test, "-f"), list.files())

This latter one transforms file_test into a function with the first argument set to "-f", which is basically what we did in the first approach, but Curry does it more cleanly because of the lamentable decision to have the function keyword be so long (why not f(x) {...} ???)
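As a side note on the length of the function keyword: R 4.1.0 and later accept \(x) as a shorthand for function(x), so the first approach can be written almost as compactly without an extra package. A minimal sketch, assuming R >= 4.1.0 and a hypothetical scratch directory:

```r
# Requires R >= 4.1.0 for the \(x) shorthand.
# Hypothetical scratch directory, used only for this illustration.
d <- file.path(tempdir(), "lambda_demo")
dir.create(file.path(d, "subdir"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(d, "a.txt"))

entries <- list.files(d, full.names = TRUE)

# Same idea as Filter(function(x) file_test("-f", x), ...), shorter spelling.
only_files <- Filter(\(x) utils::file_test("-f", x), entries)
print(basename(only_files))

# file_test() is vectorised over its path argument, so plain
# logical indexing gives the same result without Filter().
only_files2 <- entries[utils::file_test("-f", entries)]
```

Full paths are used here because file_test() resolves relative names against the working directory, not against the directory that was listed.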

So, I know that these are all old and that there was an accepted answer, but I tried most of them and none really worked.

Here is what I got:

  1. Example of all files in a folder:

     files <- list.files("Training/Out/")
  2. Output of that code:

     [1] "Filtered"           "Training_Chr01.txt" "Training_Chr02.txt" "Training_Chr03.txt"
     [5] "Training_Chr04.txt" "Training_Chr05.txt" "Training_Chr06.txt" "Training_Chr07.txt"
     [9] "Training_Chr08.txt" "Training_Chr09.txt" "Training_Chr10.txt"

Where the first one [1] is a directory

  3. Ran this code to get only the files:

     files <- list.files("Training/Out", recursive = TRUE)
  4. With this output:

     [1] "Training_Chr01.txt" "Training_Chr02.txt" "Training_Chr03.txt" "Training_Chr04.txt"
     [5] "Training_Chr05.txt" "Training_Chr06.txt" "Training_Chr07.txt" "Training_Chr08.txt"
     [9] "Training_Chr09.txt" "Training_Chr10.txt"

This is more or less to help someone who looks at this and was as confused as I was.
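One thing to watch for with this approach: recursive = TRUE drops the directory entries themselves, but it also descends into them, so any files inside a subdirectory come back with a relative path. A minimal sketch, assuming a hypothetical scratch directory in which the subdirectory is not empty (the file name kept.txt is made up):

```r
# Hypothetical scratch directory, used only for this illustration.
d <- file.path(tempdir(), "recursive_demo")
dir.create(file.path(d, "Filtered"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(d, c("Training_Chr01.txt", "Training_Chr02.txt")))
file.create(file.path(d, "Filtered", "kept.txt"))

# The directory "Filtered" is not listed, but the file inside it is.
rec <- list.files(d, recursive = TRUE)
print(rec)

# To get only the top-level files, list non-recursively and drop directories.
top <- list.files(d, full.names = TRUE)
top_files <- basename(top[!dir.exists(top)])
print(top_files)
```

If the subdirectories are empty, or their contents are wanted anyway, the recursive call alone is enough.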

I wrote a small wrapper function that tackles precisely this problem:

list_files <- function(path = ".", pattern = NULL, all.files = FALSE, full.names = TRUE, 
                       recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE, 
                       incl_dirs = FALSE){
  
  #Set incl_dirs = TRUE to revert to default list.files() behavior.

  if(path == ".") { path = getwd() }
  
  #Include directories if recursive is set.
  if(incl_dirs && recursive) { include.dirs = TRUE }
  
  #Needs to have full.names = TRUE in order to get full path to pass to dir.exists().
  files <- list.files(path = path, pattern = pattern, all.files = all.files, full.names = TRUE, 
                      recursive = recursive, ignore.case = ignore.case, include.dirs = include.dirs, 
                      no.. = no..)
  
  if(!incl_dirs){
    files <- files[!dir.exists(files)]
  }
  
  if(!full.names){
    return(basename(files))
  } else{
    return(files)
  }
  
}

With the following example directory structure:

dir_test_lvl0/
├── dir_test_lvl1
│   ├── dir_test_lvl2
│   │   ├── dir_test_lvl3
│   │   │   ├── dir_test_lvl4
│   │   │   └── file_test_lvl4
│   │   └── file_test_lvl3
│   └── file_test_lvl2
└── file_test_lvl1
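For anyone who wants to reproduce this, the structure can be created with something like the sketch below. Placing it under tempdir() is an assumption made here so that nothing is written to the working directory; pass that path (or setwd() to its parent) when calling the function.

```r
# Build the example tree under the session's temporary directory (an assumption).
root <- file.path(tempdir(), "dir_test_lvl0")
lvl1 <- file.path(root, "dir_test_lvl1")
lvl2 <- file.path(lvl1, "dir_test_lvl2")
lvl3 <- file.path(lvl2, "dir_test_lvl3")
lvl4 <- file.path(lvl3, "dir_test_lvl4")

# recursive = TRUE creates all intermediate directories in one call.
dir.create(lvl4, recursive = TRUE, showWarnings = FALSE)

# One empty file next to each nested directory, as in the tree above.
file.create(file.path(root, "file_test_lvl1"))
file.create(file.path(lvl1, "file_test_lvl2"))
file.create(file.path(lvl2, "file_test_lvl3"))
file.create(file.path(lvl3, "file_test_lvl4"))

created <- list.files(root, recursive = TRUE)
print(created)
```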

The outputs would look like this, depending on whether incl_dirs and recursive are set (or not).

#No directories presented.
#Recursive.
list_files("dir_test_lvl0", incl_dirs = FALSE, recursive = TRUE, full.names = FALSE)
# [1] "file_test_lvl4" "file_test_lvl3" "file_test_lvl2" "file_test_lvl1"

#No directories presented.
#Non-Recursive.
list_files("dir_test_lvl0", incl_dirs = FALSE, recursive = FALSE, full.names = FALSE)
# [1] "file_test_lvl1"


#With directories presented (default list.files() behavior).
#Non-recursive.
list_files("dir_test_lvl0", incl_dirs = TRUE, recursive = FALSE, full.names = FALSE)
# [1] "dir_test_lvl1"  "file_test_lvl1"


#With directories presented (default list.files() behavior).
#Recursive.
list_files("dir_test_lvl0", incl_dirs = TRUE, recursive = TRUE, full.names = FALSE)
# [1] "dir_test_lvl1"  "dir_test_lvl2"  "dir_test_lvl3"  "dir_test_lvl4"  "file_test_lvl4"
# [6] "file_test_lvl3" "file_test_lvl2" "file_test_lvl1"

All other list.files() options are passed on to it faithfully by list_files(). The function has no external dependencies (base R only).

If you are willing to use a non-base R package, try the fs package.

To get just the files in a directory:

fs::dir_ls("dir_path", type = "file")

Here is another solution, which uses a regular expression to exclude entries whose names do not contain a ".".

list.files("dir_path", pattern = "\\.")

