Java multi-threaded with CompletableFuture works slower
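I am trying to write code that counts files of a certain type on my computer. I tested both a single-threaded solution and a multi-threaded asynchronous solution, and the single-threaded one seems to run faster. Is there anything wrong with my code? If not, why doesn't it run faster?
The code is below:
AsynchFileCounter - the asynchronous version.
ExtensionFilter - a file filter that accepts only directories and files with the specified extension.
BasicFileCounter - the single-threaded version.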
import java.io.File;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class AsynchFileCounter {
    public int countFiles(String path, String extension) throws InterruptedException, ExecutionException {
        ExtensionFilter filter = new ExtensionFilter(extension, true);
        File f = new File(path);
        return countFilesRecursive(f, filter);
    }

    private int countFilesRecursive(File f, ExtensionFilter filter) throws InterruptedException, ExecutionException {
        return CompletableFuture.supplyAsync(() -> f.listFiles(filter))
            .thenApplyAsync(files -> {
                int count = 0;
                for (File file : files) {
                    if (file.isFile())
                        count++;
                    else
                        try {
                            count += countFilesRecursive(file, filter);
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                }
                return count;
            }).get();
    }
}
import java.io.File;
import java.io.FileFilter;

public class ExtensionFilter implements FileFilter {
    private final String extension;
    private final boolean allowDirectories;

    public ExtensionFilter(String extension, boolean allowDirectories) {
        // Accept both "txt" and ".txt" by dropping a leading dot.
        if (extension.startsWith("."))
            extension = extension.substring(1);
        this.extension = extension;
        this.allowDirectories = allowDirectories;
    }

    @Override
    public boolean accept(File pathname) {
        if (pathname.isFile() && pathname.getName().endsWith("." + extension))
            return true;
        // Directories are let through so the caller can recurse into them.
        return allowDirectories && pathname.isDirectory();
    }
}
import java.io.File;

public class BasicFileCounter {
    public int countFiles(String path, String extension) {
        ExtensionFilter filter = new ExtensionFilter(extension, true);
        File f = new File(path);
        return countFilesRecursive(f, filter);
    }

    private int countFilesRecursive(File f, ExtensionFilter filter) {
        int count = 0;
        File[] ar = f.listFiles(filter);
        for (File file : ar) {
            if (file.isFile())
                count++;
            else
                count += countFilesRecursive(file, filter);
        }
        return count;
    }
}
You have to spawn multiple asynchronous jobs and must not wait for their completion immediately:
public int countFiles(String path, String extension) {
    ExtensionFilter filter = new ExtensionFilter(extension, true);
    File f = new File(path);
    return countFilesRecursive(f, filter).join();
}

private CompletableFuture<Integer> countFilesRecursive(File f, FileFilter filter) {
    return CompletableFuture.supplyAsync(() -> f.listFiles(filter))
        .thenCompose(files -> {
            if (files == null) return CompletableFuture.completedFuture(0);
            int count = 0;
            CompletableFuture<Integer> fileCount = new CompletableFuture<>(), all = fileCount;
            for (File file : files) {
                if (file.isFile())
                    count++;
                else
                    all = countFilesRecursive(file, filter).thenCombine(all, Integer::sum);
            }
            fileCount.complete(count);
            return all;
        });
}
Note that File.listFiles may return null.
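As a small illustration of that null case, here is a sketch of a wrapper that turns null into an empty array so callers can loop unconditionally. The class name ListFilesNullSketch and the helper safeList are made up for this example and are not part of the answer's code.

```java
import java.io.File;

final class ListFilesNullSketch {
    // Hypothetical helper: listFiles() returns null when the path is not a
    // directory or an I/O error occurs, so map that case to an empty array.
    static File[] safeList(File dir) {
        File[] entries = dir.listFiles();
        return entries != null ? entries : new File[0];
    }
}
```

The answer's code handles the same case inline with an early return of a completed future holding 0.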
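This code counts all files of a directory immediately but launches a new asynchronous job for each subdirectory. The results of the subdirectory jobs are combined via thenCombine, which sums them up. For simplicity, we create another CompletableFuture, fileCount, to represent the locally counted files. thenCompose returns a future that will be completed with the result of the future returned by the specified function, so the caller can use join() to wait for the final result of the entire operation.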
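The same composition pattern can be tried without touching the file system. The following is only a sketch under assumptions: the Node class is a made-up in-memory stand-in for a directory, and because its local count is known up front, a completed future replaces the separate fileCount future used above.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class ComposeSketch {
    // Hypothetical in-memory "directory", used only for illustration.
    static final class Node {
        final int localFiles;        // files directly inside this node
        final List<Node> children;   // "subdirectories"

        Node(int localFiles, List<Node> children) {
            this.localFiles = localFiles;
            this.children = children;
        }
    }

    // Returns a future rather than a value, so no task blocks while the tree is walked.
    static CompletableFuture<Integer> count(Node node) {
        return CompletableFuture.supplyAsync(() -> node) // stands in for listFiles()
            .thenCompose(n -> {
                CompletableFuture<Integer> all = CompletableFuture.completedFuture(n.localFiles);
                for (Node child : n.children) {
                    // The child job starts now; its result is added once it is available.
                    all = count(child).thenCombine(all, Integer::sum);
                }
                return all;
            });
    }

    static int demo() {
        Node leafA = new Node(2, List.of());
        Node leafB = new Node(3, List.of());
        Node root = new Node(1, List.of(leafA, leafB));
        return count(root).join(); // the only blocking call
    }
}
```

Here demo() adds up 1 + 2 + 3, and the only place a thread waits is the final join() in the caller.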
For I/O operations, it may help to use a different thread pool, as the default ForkJoinPool is configured to utilize the CPU cores rather than the I/O bandwidth:
public int countFiles(String path, String extension) {
    ExecutorService es = Executors.newFixedThreadPool(30);
    try {
        ExtensionFilter filter = new ExtensionFilter(extension, true);
        File f = new File(path);
        return countFilesRecursive(f, filter, es).join();
    } finally {
        // Shut the pool down even if join() throws.
        es.shutdown();
    }
}

private CompletableFuture<Integer> countFilesRecursive(File f, FileFilter filter, Executor e) {
    return CompletableFuture.supplyAsync(() -> f.listFiles(filter), e)
        .thenCompose(files -> {
            if (files == null) return CompletableFuture.completedFuture(0);
            int count = 0;
            CompletableFuture<Integer> fileCount = new CompletableFuture<>(), all = fileCount;
            for (File file : files) {
                if (file.isFile())
                    count++;
                else
                    all = countFilesRecursive(file, filter, e).thenCombine(all, Integer::sum);
            }
            fileCount.complete(count);
            return all;
        });
}
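There is no best number of threads; it depends on the actual execution environment and is subject to measuring and tuning. When the application is supposed to run in different environments, this should be a configurable parameter.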
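One possible way to make the thread count configurable is a JVM system property. This is only a sketch: the property name filecounter.threads and the fallback of twice the processor count are assumptions chosen for illustration, not values recommended by the answer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolConfigSketch {
    // "filecounter.threads" is a made-up property name for this example.
    static int configuredThreads() {
        // Arbitrary fallback (an assumption): twice the processor count, at least 2.
        int fallback = Math.max(2, Runtime.getRuntime().availableProcessors() * 2);
        // Integer.getInteger reads a system property and falls back if it is absent or not a number.
        return Math.max(1, Integer.getInteger("filecounter.threads", fallback));
    }

    static ExecutorService newPool() {
        return Executors.newFixedThreadPool(configuredThreads());
    }
}
```

The value could then be set at launch with -Dfilecounter.threads=30 and tuned per environment without recompiling.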
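But consider that you might be using the wrong tool for the job. An alternative is Fork/Join tasks, which support interacting with the thread pool to determine the current saturation, so once all worker threads are busy, a task proceeds to scan locally using ordinary recursion rather than submitting more asynchronous jobs: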
public int countFiles(String path, String extension) {
    ExtensionFilter filter = new ExtensionFilter(extension, true);
    File f = new File(path);
    return POOL.invoke(new FileCountTask(f, filter));
}

private static final int TARGET_SURPLUS = 3, TARGET_PARALLELISM = 30;
private static final ForkJoinPool POOL = new ForkJoinPool(TARGET_PARALLELISM);

static final class FileCountTask extends RecursiveTask<Integer> {
    private final File path;
    private final FileFilter filter;

    public FileCountTask(File file, FileFilter ff) {
        this.path = file;
        this.filter = ff;
    }

    @Override
    protected Integer compute() {
        return scan(path, filter);
    }

    private static int scan(File directory, FileFilter filter) {
        File[] fileList = directory.listFiles(filter);
        if (fileList == null || fileList.length == 0) return 0;
        List<FileCountTask> recursiveTasks = new ArrayList<>();
        int count = 0;
        for (File file : fileList) {
            if (file.isFile()) count++;
            else {
                // Fork only while this worker has few surplus queued tasks;
                // otherwise scan the subdirectory in the current thread.
                if (getSurplusQueuedTaskCount() < TARGET_SURPLUS) {
                    FileCountTask task = new FileCountTask(file, filter);
                    recursiveTasks.add(task);
                    task.fork();
                }
                else count += scan(file, filter);
            }
        }
        // Tasks that no other worker has picked up yet are taken back, most
        // recently forked first, and computed locally.
        for (int ix = recursiveTasks.size() - 1; ix >= 0; ix--) {
            FileCountTask task = recursiveTasks.get(ix);
            if (task.tryUnfork()) task.complete(scan(task.path, task.filter));
        }
        for (FileCountTask task : recursiveTasks) {
            count += task.join();
        }
        return count;
    }
}
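I figured it out. Because I add up the results in this line:
count += countFilesRecursive(file, filter);
and use get() to receive the result, I am actually waiting for each result instead of really parallelizing the code.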
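The effect can be made visible with a small experiment. This is only a sketch under assumptions: slowJob is a made-up stand-in for a slow directory listing, the 200 ms sleep and the pool size are arbitrary, and join() is used instead of get() merely to avoid checked exceptions. It records how many jobs overlap in each pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class BlockingVsDeferredSketch {
    private static final AtomicInteger running = new AtomicInteger();
    private static final AtomicInteger peak = new AtomicInteger();

    // Hypothetical slow job; tracks the highest number of jobs running at once.
    private static int slowJob() {
        peak.accumulateAndGet(running.incrementAndGet(), Math::max);
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            running.decrementAndGet();
        }
        return 1;
    }

    // Pattern from the question: submit one job, wait for it, then submit the next.
    static int peakWhenWaitingImmediately(int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        try {
            peak.set(0);
            for (int i = 0; i < jobs; i++) {
                CompletableFuture.supplyAsync(BlockingVsDeferredSketch::slowJob, pool).join();
            }
            return peak.get();
        } finally {
            pool.shutdown();
        }
    }

    // Submit every job first and wait only after all of them have been started.
    static int peakWhenWaitingAtTheEnd(int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        try {
            peak.set(0);
            List<CompletableFuture<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                futures.add(CompletableFuture.supplyAsync(BlockingVsDeferredSketch::slowJob, pool));
            }
            futures.forEach(CompletableFuture::join);
            return peak.get();
        } finally {
            pool.shutdown();
        }
    }
}
```

With immediate waiting, at most one job ever runs at a time regardless of the pool size. With deferred waiting, the jobs typically overlap, which is where any speedup comes from.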
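This is my current code, which actually runs much faster than the single-threaded code. However, I could not find an elegant way to know when the parallel method has finished.
I would love to hear how I should solve this.
This is the ugly way I use: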
import java.io.File;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.LongAdder;

public class AsynchFileCounter {
    private LongAdder count;

    public int countFiles(String path, String extension) {
        count = new LongAdder();
        ExtensionFilter filter = new ExtensionFilter(extension, true);
        File f = new File(path);
        countFilesRecursive(f, filter);
        // ******** The way I check whether the function is done **************** //
        // Poll until the counter stops growing between two 50 ms samples.
        int prev = 0;
        int cur = 0;
        do {
            prev = cur;
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {}
            cur = (int) count.sum();
        } while (cur > prev);
        // ******************************************************************** //
        return count.intValue();
    }

    private void countFilesRecursive(File f, ExtensionFilter filter) {
        CompletableFuture.supplyAsync(() -> f.listFiles(filter))
            .thenAcceptAsync(files -> {
                for (File file : files) {
                    if (file.isFile())
                        count.increment();
                    else
                        countFilesRecursive(file, filter);
                }
            });
    }
}
I made some changes to the code:
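import java.io.File;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class AsynchFileCounter {
    private AtomicInteger count;
    private AtomicInteger countDirectories;
    private ReentrantLock lock;
    private Condition noMoreDirectories;

    public int countFiles(String path, String extension) {
        count = new AtomicInteger();
        countDirectories = new AtomicInteger();
        lock = new ReentrantLock();
        noMoreDirectories = lock.newCondition();
        ExtensionFilter filter = new ExtensionFilter(extension, true);
        File f = new File(path);
        countFilesRecursive(f, filter);
        lock.lock();
        try {
            // Re-check the counter under the lock: if the last directory finished
            // before we got here, the signal was already sent and a bare await()
            // would never return. The loop also guards against spurious wakeups.
            while (countDirectories.get() > 0)
                noMoreDirectories.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
        return count.intValue();
    }

    private void countFilesRecursive(File f, ExtensionFilter filter) {
        // Register the directory before its job is submitted.
        countDirectories.getAndIncrement();
        CompletableFuture.supplyAsync(() -> f.listFiles(filter))
            .thenAcceptAsync(files -> countFiles(filter, files));
    }

    private void countFiles(ExtensionFilter filter, File[] files) {
        if (files != null) {
            for (File file : files) {
                if (file.isFile())
                    count.incrementAndGet();
                else
                    countFilesRecursive(file, filter);
            }
        }
        // The directory that brings the counter back to zero wakes up the caller.
        int currentCount = countDirectories.decrementAndGet();
        if (currentCount == 0) {
            lock.lock();
            try {
                noMoreDirectories.signal();
            } finally {
                lock.unlock();
            }
        }
    }
}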
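The same "pending directories" idea can also be expressed with a CountDownLatch instead of a lock and condition. The following is only a sketch under assumptions, not the author's method: the class name LatchFileCounterSketch is made up, it matches the extension itself instead of using ExtensionFilter, an instance can be used for one count only, and demo() builds a small temporary directory that it does not clean up.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class LatchFileCounterSketch {
    private final AtomicInteger files = new AtomicInteger();
    private final AtomicInteger pendingDirectories = new AtomicInteger();
    private final CountDownLatch done = new CountDownLatch(1); // one-shot: single use per instance
    private final String suffix;

    LatchFileCounterSketch(String extension) {
        this.suffix = extension.startsWith(".") ? extension : "." + extension;
    }

    int count(File root) throws InterruptedException {
        submit(root);
        done.await(); // returns immediately if the latch was already released
        return files.get();
    }

    private void submit(File dir) {
        pendingDirectories.incrementAndGet(); // registered before the job is started
        CompletableFuture.runAsync(() -> scan(dir));
    }

    private void scan(File dir) {
        try {
            File[] entries = dir.listFiles();
            if (entries != null) {
                for (File entry : entries) {
                    if (entry.isDirectory()) submit(entry);
                    else if (entry.isFile() && entry.getName().endsWith(suffix)) files.incrementAndGet();
                }
            }
        } finally {
            // Children were registered above, so zero is reached only when all work is done.
            if (pendingDirectories.decrementAndGet() == 0) done.countDown();
        }
    }

    // Builds a tiny temporary tree with two .txt files and one .log file, then counts the .txt files.
    static int demo() {
        try {
            Path root = Files.createTempDirectory("latch-sketch");
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.createFile(root.resolve("a.txt"));
            Files.createFile(root.resolve("b.log"));
            Files.createFile(sub.resolve("c.txt"));
            return new LatchFileCounterSketch("txt").count(root.toFile());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

A latch remembers that it has been released, so the waiting thread cannot miss the notification even if every directory finishes before it starts waiting.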