
Recursively list all files within a directory using nio.file.DirectoryStream

I want to list all the files within a specified directory and within its subdirectories. No directories should be listed.

My current code is below. It does not work properly: it lists only the files and directories directly inside the specified directory.

How can I fix this?

final List<Path> files = new ArrayList<>();

Path path = Paths.get("C:\\Users\\Danny\\Documents\\workspace\\Test\\bin\\SomeFiles");
try
{
  DirectoryStream<Path> stream;
  stream = Files.newDirectoryStream(path);
  for (Path entry : stream)
  {
    files.add(entry);
  }
  stream.close();
}
catch (IOException e)
{
  e.printStackTrace();
}

for (Path entry: files)
{
  System.out.println(entry.toString());
}

Java 8 provides a nice way to do that:

Files.walk(path)

This method returns a Stream<Path>. The stream contains directories as well as files, so filter it if you want files only.
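As a minimal sketch of how Files.walk can answer the question directly, the stream can be filtered down to regular files and collected into a list. The class name WalkFilesExample and the helper listRegularFiles are hypothetical names chosen for illustration; only Files.walk and Files.isRegularFile come from the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WalkFilesExample {

    // Hypothetical helper: returns only the regular files under root, at any depth.
    static List<Path> listRegularFiles(Path root) throws IOException {
        // Files.walk keeps directory handles open, so the stream is closed
        // with try-with-resources once the result has been collected.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile) // drops directories
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directory to scan is the first argument, else the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        listRegularFiles(root).forEach(System.out::println);
    }
}
```

Collecting inside the try block matters here: the stream is lazy, so the walk must finish before the stream is closed.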

Write a method that calls itself whenever the next element is a directory:

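// 'files' is the List<Path> from the question, held as a field.
void listFiles(Path path) throws IOException {
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(path)) {
        for (Path entry : stream) {
            if (Files.isDirectory(entry)) {
                // Descend into the subdirectory instead of listing it.
                listFiles(entry);
            } else {
                // Only non-directory entries are collected.
                files.add(entry);
            }
        }
    }
}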
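The same recursion can be written without a shared field by passing the result list down explicitly. This is only a sketch: the class name RecursiveDirectoryStreamExample and the helper collect are hypothetical, and it assumes that following symbolic links to directories is acceptable (Files.isDirectory follows links by default).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class RecursiveDirectoryStreamExample {

    // Returns every non-directory entry below dir, at any depth.
    static List<Path> listFiles(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        collect(dir, result);
        return result;
    }

    // Hypothetical helper: appends the files found below dir to result.
    private static void collect(Path dir, List<Path> result) throws IOException {
        // One DirectoryStream per directory; try-with-resources closes it.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (Files.isDirectory(entry)) {
                    collect(entry, result); // recurse, do not list the directory itself
                } else {
                    result.add(entry);
                }
            }
        }
    }
}
```

A call such as RecursiveDirectoryStreamExample.listFiles(path) then returns the list, so the caller does not have to manage any state.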

Check out FileVisitor; it is very neat.

 Path path= Paths.get("C:\\Users\\Danny\\Documents\\workspace\\Test\\bin\\SomeFiles");
 final List<Path> files=new ArrayList<>();
 try {
    Files.walkFileTree(path, new SimpleFileVisitor<Path>(){
     @Override
     public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
          if(!attrs.isDirectory()){
               files.add(file);
          }
          return FileVisitResult.CONTINUE;
      }
     });
 } catch (IOException e) {
      e.printStackTrace();
 }
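One detail worth knowing about the visitor approach: by default SimpleFileVisitor rethrows the IOException when an entry cannot be read, which aborts the whole walk. The sketch below, with the hypothetical class name FileVisitorExample, overrides visitFileFailed so that unreadable entries are skipped instead. Whether skipping is the right policy is an assumption that depends on your use case.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

class FileVisitorExample {

    // Collects regular files below root, skipping entries that cannot be read.
    static List<Path> listFiles(Path root) throws IOException {
        final List<Path> files = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    files.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Assumption: an unreadable entry should be reported and skipped,
                // not abort the traversal.
                System.err.println("Skipping " + file + ": " + exc);
                return FileVisitResult.CONTINUE;
            }
        });
        return files;
    }
}
```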

If you want to avoid both a function that calls itself recursively and a file list held in a member variable, you can use a stack:

private List<Path> listFiles(Path path) throws IOException {
    Deque<Path> stack = new ArrayDeque<Path>();
    final List<Path> files = new LinkedList<>();

    stack.push(path);

    while (!stack.isEmpty()) {
        // try-with-resources closes the stream even if iteration throws
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(stack.pop())) {
            for (Path entry : stream) {
                if (Files.isDirectory(entry)) {
                    stack.push(entry);
                }
                else {
                    files.add(entry);
                }
            }
        }
    }

    return files;
}

Using RxJava, the requirement can be met in several ways while still using DirectoryStream from the JDK.

The following combinations give the desired effect; I'll explain them in sequence:

Approach 1. A recursive approach using the flatMap() and defer() operators

Approach 2. A recursive approach using the flatMap() and fromCallable() operators

Note: If you replace flatMap() with concatMap(), the directory tree is necessarily traversed in a depth-first search (DFS) manner. With flatMap(), the DFS effect is not guaranteed.

Approach 1: Using flatMap() and defer()

   private Observable<Path> recursiveFileSystemNavigation_Using_Defer(Path dir) {
       return Observable.<Path>defer(() -> {
            //
            // try-resource block
            //
            try(DirectoryStream<Path> children = Files.newDirectoryStream(dir))
            {
                //This intermediate storage is required because DirectoryStream can't be navigated more than once.
                List<Path> subfolders = Observable.<Path>fromIterable(children)
                                                        .toList()
                                                        .blockingGet();


                return Observable.<Path>fromIterable(subfolders)
                        /* Line X */    .flatMap(p -> !isFolder(p) ? Observable.<Path> just(p) : recursiveFileSystemNavigation_Using_Defer(p), Runtime.getRuntime().availableProcessors());

                //      /* Line Y */  .concatMap(p -> !isFolder(p) ? Observable.<Path> just(p) : recursiveFileSystemNavigation_Using_Defer(p));

            } catch (IOException e) {
                /*
                 This catch block is required even though DirectoryStream is  Closeable
                 resource. Reason is that .close() call on a DirectoryStream throws a 
                 checked exception.
                */
                return Observable.<Path>empty();
            }
       });
    }

This approach finds the children of the given directory and then emits them as Observables. If a child is a file, it is immediately available to a subscriber; otherwise flatMap() on Line X invokes the method recursively, passing each subdirectory as the argument. For each such subdirectory, flatMap() internally subscribes to its children all at the same time. This is like a chain reaction that needs to be controlled.

Therefore Runtime.getRuntime().availableProcessors() is passed to set the maximum concurrency level for flatMap(), which prevents it from subscribing to all subfolders at the same time. Without a concurrency limit, imagine what would happen when a folder has 1000 children.

Using defer() prevents the DirectoryStream from being created prematurely and ensures it is created only when a real subscription to find its subfolders is made.

Finally, the method returns an Observable<Path> so that a client can subscribe and do something useful with the results, as shown below:

//
// Using the defer() based approach
//
recursiveDirNavigation.recursiveFileSystemNavigation_Using_Defer(startingDir)
                    .subscribeOn(Schedulers.io())
                    .observeOn(Schedulers.from(Executors.newFixedThreadPool(1)))
                    .subscribe(p -> System.out.println(p.toUri()));

The disadvantage of defer() is that it does not deal nicely with checked exceptions thrown by its argument function. Therefore, even though the DirectoryStream (which implements Closeable) was created in a try-with-resources block, we still had to catch the IOException, because the automatic close of a DirectoryStream throws that checked exception.

In Rx style, using catch blocks for error handling seems a bit odd, because even errors are delivered as events in reactive programming. So why not use an operator that exposes such errors as events?

A better alternative named fromCallable() was added in RxJava 2.x. The second approach shows how to use it.

Approach 2: Using flatMap() and fromCallable()

This approach uses the fromCallable() operator, which takes a Callable as its argument. Since we want a recursive approach, the expected result of that Callable is an Observable of the children of the given folder. Since we want a subscriber to receive results as they become available, the method must return an Observable. Because the inner Callable's result is itself an Observable of children, the net effect is an Observable of Observables.

   private Observable<Observable<Path>> recursiveFileSystemNavigation_WithoutExplicitCatchBlock_UsingFromCallable(Path dir) {
       /*
        * fromCallable() takes a Callable argument. In this case the callable's return value itself is 
        * an Observable of sub-paths, therefore the overall return value of this method is Observable<Observable<Path>>
        * 
        * While subscribing the final results, we'd flatten this return value.
        * 
        * Benefit of using fromCallable() is that it elegantly catches the checked exceptions thrown 
        * during the callable's call and exposes that via onError() operator chain if you need. 
        * 
        * Defer() operator does not give that flexibility and you have to explicitly catch and handle appropriately.   
        */
       return Observable.<Observable<Path>> fromCallable(() -> traverse(dir))
                                        .onErrorReturnItem(Observable.<Path>empty());

    }

    private Observable<Path> traverse(Path dir) throws IOException {
        //
        // try-resource block
        //
        try(DirectoryStream<Path> children = Files.newDirectoryStream(dir))
        {
            //This intermediate storage is required because DirectoryStream can't be navigated more than once.
            List<Path> subfolders = Observable.<Path>fromIterable(children)
                                                    .toList()
                                                    .blockingGet();

            return Observable.<Path>fromIterable(subfolders)
                    /* Line X */    .flatMap(p -> ( !isFolder(p) ? Observable.<Path> just(p) : recursiveFileSystemNavigation_WithoutExplicitCatchBlock_UsingFromCallable(p).blockingSingle())
                                             ,Runtime.getRuntime().availableProcessors());

            //      /* Line Y */  .concatMap(p -> ( !isFolder(p) ? Observable.<Path> just(p) : recursiveFileSystemNavigation_WithoutExplicitCatchBlock_UsingFromCallable(p).blockingSingle() ));

        }
    }

A subscriber then needs to flatten the result stream, as shown below:

//
// Using the fromCallable() based approach
//
recursiveDirNavigation.recursiveFileSystemNavigation_WithoutExplicitCatchBlock_UsingFromCallable(startingDir)
                        .subscribeOn(Schedulers.io())
                        .flatMap(p -> p)
                        .observeOn(Schedulers.from(Executors.newFixedThreadPool(1)))
                        .subscribe(filePath -> System.out.println(filePath.toUri()));

In the traverse() method, why does Line X use blockingSingle()?

Because the recursive function returns an Observable<Observable<Path>>, but flatMap() on that line needs an Observable<Path> to subscribe to.

Line Y in both approaches uses concatMap()

Because concatMap() can be used comfortably when we don't want parallelism in the inner subscriptions that flatMap() would make.

In both approaches, the isFolder method is implemented as follows:

private boolean isFolder(Path p){
    // Anything that is not a regular file is treated as a folder.
    return !p.toFile().isFile();
}

Maven coordinates for RxJava 2.0

<dependency>
    <groupId>io.reactivex.rxjava2</groupId>
    <artifactId>rxjava</artifactId>
    <version>2.0.3</version>
</dependency>

Imports in the Java file

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.Executors;
import io.reactivex.Observable;
import io.reactivex.schedulers.Schedulers;

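This is the shortest implementation I came up with:

final List<Path> files = new ArrayList<>();
Path path = Paths.get("C:\\Users\\Danny\\Documents\\workspace\\Test\\bin\\SomeFiles");
// Files.walk keeps directories open, so close the stream with try-with-resources.
try (Stream<Path> paths = Files.walk(path)) {
    // Keep regular files only; the walk also yields directories.
    paths.filter(Files::isRegularFile).forEach(files::add);
} catch (IOException e) {
    e.printStackTrace();
}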

A complete implementation: it visits every file in every subfolder and keeps those that pass a quick check on the name:

Path configFilePath = FileSystems.getDefault().getPath("C:\\Users\\sharmaat\\Desktop\\issue\\stores");
List<Path> fileWithName = Files.walk(configFilePath)
                .filter(s -> s.toString().endsWith(".java"))
                .map(Path::getFileName)
                .sorted()
                .collect(Collectors.toList());

for (Path name : fileWithName) {
    // printing the name of file in every sub folder
    System.out.println(name);
}
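A name-only filter like the one above would also match a directory whose name happens to end in ".java". As a hedged alternative, Files.find passes each entry's attributes to the matcher, so the file-type check and the name check can be combined in one step. The class name FindByExtensionExample and the method findByExtension are hypothetical; this is a sketch, not the original answer's code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FindByExtensionExample {

    // Returns the regular files below root whose names end with the given suffix, sorted by path.
    static List<Path> findByExtension(Path root, String suffix) throws IOException {
        // Integer.MAX_VALUE means "no depth limit".
        try (Stream<Path> matches = Files.find(root, Integer.MAX_VALUE,
                (path, attrs) -> attrs.isRegularFile()
                        && path.getFileName().toString().endsWith(suffix))) {
            return matches.sorted().collect(Collectors.toList());
        }
    }
}
```

Unlike the snippet above, this returns full paths rather than bare file names; map with Path::getFileName afterwards if only the names are needed.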

Try this. It traverses every folder and prints both folders and files:

public static void traverseDir(Path path) {
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(path)) {
        for (Path entry : stream) {
            if (Files.isDirectory(entry)) {
                System.out.println("Sub-Folder Name : " + entry.toString());
                traverseDir(entry);
            } else {
                System.out.println("\tFile Name : " + entry.toString());
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Try this: you will get a list of the file paths in the directory and its subdirectories. The subdirectories may be nested to any depth, so use a recursive process.

public class DriectoryFileFilter {
    private List<String> filePathList = new ArrayList<String>();

    public List<String> read(File file) {
        if (file.isFile()) {
            filePathList.add(file.getAbsolutePath());
        } else if (file.isDirectory()) {
            File[] listOfFiles = file.listFiles();
            if (listOfFiles != null) {
                for (int i = 0; i < listOfFiles.length; i++){
                    read(listOfFiles[i]);
                }
            } else {
                System.out.println("[ACCESS DENIED]");
            }
        }
        return filePathList;
    }
}
