
Is there simple Java logic for processing both pre-existing and newly created files in the same directory?

In Java, here is one of several ways to process a "snapshot" of the files in a particular directory:

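import java.io.File;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

String directory = "/path/to/directory";
// Note: listFiles() returns null if the path is not a directory or cannot be read
List<File> fileList = Arrays.asList(new File(directory).listFiles());
fileList.parallelStream().forEach(file -> {
    Path fileAsPath = file.toPath();
    // Assume the process method finishes by deleting the file or moving it to another directory
    process(fileAsPath);
});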

And here is one of several ways to process files that are added to the directory:

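import java.nio.file.*;
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.OVERFLOW;

WatchService watchService = FileSystems.getDefault().newWatchService();
Path directoryAsPath = Paths.get(directory);
directoryAsPath.register(watchService, ENTRY_CREATE);

while (true) {
    // Blocks until a key with pending events is available
    WatchKey key = watchService.take();

    for (WatchEvent<?> event : key.pollEvents()) {
        WatchEvent.Kind<?> kind = event.kind();
        if (kind == OVERFLOW) {
            continue;
        }

        // For ENTRY_CREATE the context is a Path relative to the watched directory
        Path filename = directoryAsPath.resolve((Path) event.context());
        // Again, assume the process method finishes by deleting the file or moving it
        // to another directory
        process(filename);
    }

    // The key must be reset to receive further events; false means it is no longer valid
    if (!key.reset()) {
        break;
    }
}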

What would be a fairly straightforward approach to process pre-existing files in the directory -- such as when the process starts -- and also process files that are subsequently added?

Each file should be processed exactly once. In this situation, the order in which files are processed does not matter.

I suppose one straightforward way would be to put the first block of logic in an infinite loop -- just have the listFiles() method take a new snapshot of the directory, perhaps with a brief delay between iterations -- but this seems clunky. Files can be on the order of tens of megabytes, so it would be nice not to have to wait for an entire "snapshot" of files to be fully processed before beginning the next one.

Using a database to track the files that have been processed seems overly complicated.

Thanks!

Use 2 directories.

First move the existing files out to a temp dir, then copy them back. These files, as well as any created later, will all trigger the watch as new files.
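A minimal sketch of that move-out-and-back step, in case it helps. The class and method names (`Requeue`, `moveAll`, `requeueExisting`) are hypothetical, and it assumes the watch on the directory is already registered before `requeueExisting` is called, and that the staging directory lives on the same file system so the moves are cheap. It moves the files back rather than copying them, which should have the same effect on the watch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

class Requeue {
    /** Moves every regular file in source into target; returns how many were moved. */
    static int moveAll(Path source, Path target) throws IOException {
        // Collect the entries first so the directory is not changed while it is being iterated
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(source)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    files.add(entry);
                }
            }
        }
        for (Path file : files) {
            Files.move(file, target.resolve(file.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        return files.size();
    }

    /**
     * Cycles the pre-existing files through the staging directory so that each
     * one reappears in the watched directory as a new entry.
     * Assumption: the watch on "watched" is registered before this is called.
     */
    static int requeueExisting(Path watched, Path staging) throws IOException {
        moveAll(watched, staging);
        return moveAll(staging, watched);
    }
}
```

One caveat with this sketch: a file that arrives between registering the watch and the move-out may be reported twice, so the process method would probably need to tolerate a path that has already been handled (for example by checking that the file still exists).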

If you're on Linux, you could instead try to touch each existing file (untested, but may be enough to trigger the watch).
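A rough Java equivalent of touching each file, as a sketch only. The names `TouchExisting` and `touchAll` are hypothetical. Updating a timestamp modifies an existing file rather than creating one, so this assumes the watch is registered for ENTRY_MODIFY in addition to ENTRY_CREATE; whether an event is actually delivered depends on the platform and, like the suggestion itself, has not been verified here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

class TouchExisting {
    /** Sets the last-modified time of every regular file in dir to now; returns the count. */
    static int touchAll(Path dir) throws IOException {
        FileTime now = FileTime.fromMillis(System.currentTimeMillis());
        int touched = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    // Same effect on the timestamp as the touch command
                    Files.setLastModifiedTime(entry, now);
                    touched++;
                }
            }
        }
        return touched;
    }
}
```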
