
Java: Create large number of Callables or distribute iterator result to threads?

I've written an application to manipulate images. My manipulation code should be applied to every image in a folder (up to 1 million per folder).

So far, for each image in the folder I create a Callable (a worker that manipulates the image) and add it to an ArrayList. Then I use the invokeAll method of a FixedThreadPool to parallelize the work.

However, my question is: is this good design? I doubt that adding 1 million elements to the ArrayList first really makes sense. I was thinking of passing an iterator (over the files) to all the threads and letting each thread take the next element and process it (with the problem of blocking, of course), but does that make sense?
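For reference, the shared-iterator idea could look roughly like the sketch below. It is only an illustration: the class name SharedIteratorSketch, the processAll method and the manipulate callback are hypothetical, and the lock around hasNext/next is exactly the blocking cost mentioned above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class SharedIteratorSketch {
    // Runs nThreads workers that all pull from the same iterator; returns how many items were handled.
    static <T> int processAll(Iterator<T> items, int nThreads, Consumer<T> manipulate) {
        Object lock = new Object();
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Integer>> workers = new ArrayList<>();
            for (int i = 0; i < nThreads; i++) {
                workers.add(pool.submit(() -> {
                    int handled = 0;
                    while (true) {
                        T next;
                        // Iterators are not thread-safe, so hasNext/next must happen under one lock.
                        synchronized (lock) {
                            if (!items.hasNext()) {
                                break;
                            }
                            next = items.next();
                        }
                        // The actual work runs outside the lock, in parallel.
                        manipulate.accept(next);
                        handled++;
                    }
                    return handled;
                }));
            }
            int total = 0;
            for (Future<Integer> f : workers) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Only as many Callables as threads are created here, so the list of 1 million workers disappears; the price is a short critical section per file.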

It sounds OK, even if it is not necessarily very efficient and does not scale very well. An alternative design could be:

  • create an ArrayBlockingQueue<File> with a capacity larger than your FixedThreadPool (say, twice as big)
  • create a FileVisitor, let's call it ImageFileVisitor, whose visitFile method puts the visited file in the queue; put is a blocking call, so it waits while the queue is full
  • create as many Callables as the size of your pool and make each of them take from the queue and do what it has to do
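The three steps above can be sketched as follows. This is a minimal illustration under some assumptions, not a definitive implementation: the class name QueueFedWorkers, the processAll method and the manipulate callback are hypothetical, the queue holds Path rather than File because FileVisitor works with Path, an anonymous SimpleFileVisitor plays the role of ImageFileVisitor, and a sentinel entry per worker (not mentioned above) is used to tell the workers when to stop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class QueueFedWorkers {
    // Sentinel compared by identity; one is queued per worker to signal "no more files".
    private static final Path DONE = Paths.get("");

    // Walks the folder, feeds files through a small bounded queue, returns the number processed.
    static int processAll(Path folder, int nThreads, Consumer<Path> manipulate) {
        BlockingQueue<Path> queue = new ArrayBlockingQueue<>(2 * nThreads);
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Integer>> workers = new ArrayList<>();
            for (int i = 0; i < nThreads; i++) {
                Callable<Integer> worker = () -> {
                    int handled = 0;
                    for (Path p = queue.take(); p != DONE; p = queue.take()) {
                        try {
                            manipulate.accept(p);
                            handled++;
                        } catch (RuntimeException e) {
                            // Keep draining so the producer never blocks on dead workers.
                            System.err.println("Failed on " + p + ": " + e);
                        }
                    }
                    return handled;
                };
                workers.add(pool.submit(worker));
            }
            // Producer: walk the folder on the calling thread; put blocks while the queue is full.
            Files.walkFileTree(folder, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    try {
                        queue.put(file);
                        return FileVisitResult.CONTINUE;
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return FileVisitResult.TERMINATE;
                    }
                }
            });
            for (int i = 0; i < nThreads; i++) {
                queue.put(DONE);
            }
            int total = 0;
            for (Future<Integer> f : workers) {
                total += f.get();
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

At any moment only about 2 × nThreads paths are held in memory, regardless of how many files the folder contains.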

Note: the thread pool should be fairly small. If your image processing is very heavy, use the number of processors as the size; if it is fairly trivial and most of the time is spent reading and writing files, use a smaller size.
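As a rough illustration of that rule of thumb, the pool size could be derived from Runtime.availableProcessors(). The helper name poolSize and the "half the cores" figure for I/O-bound work are assumptions made for this sketch only; the right smaller value depends on your disks and should be measured.

```java
class PoolSizing {
    // cpuHeavy = true: one thread per processor; otherwise a smaller pool (half, at least 1).
    static int poolSize(boolean cpuHeavy) {
        int cores = Runtime.getRuntime().availableProcessors();
        // "cores / 2" is an arbitrary placeholder for "a smaller size", not a measured value.
        return cpuHeavy ? cores : Math.max(1, cores / 2);
    }
}
```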

FixedThreadPool uses a LinkedBlockingQueue with a capacity of Integer.MAX_VALUE:

public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

So it is effectively unbounded: you could offer or put a million Runnable instances into it. That is an unnecessary use of memory, holding millions of objects even though your fixed pool size would be much smaller, say 5 or 10.

One approach that would directly improve this scenario is to use a FixedThreadPool with a finite queue size:

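import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

int nThreads = 10;    // worker threads
int maxQSize = 1000;  // maximum number of tasks waiting in the queue
ExecutorService service = new ThreadPoolExecutor(nThreads, nThreads,
                                          0L, TimeUnit.MILLISECONDS,
                                          new LinkedBlockingQueue<Runnable>(maxQSize));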

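With this approach at most 1000 runnables wait in the queue while 10 threads run. Be aware that ThreadPoolExecutor adds tasks with offer rather than put, so a full queue does not block the submitter by default: the task is rejected with a RejectedExecutionException. To make submission slow down instead, set a RejectedExecutionHandler such as ThreadPoolExecutor.CallerRunsPolicy, and submit tasks as you iterate over the files; invokeAll still needs the complete list of Callables up front.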
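A sketch of the bounded pool with back-pressure is shown below. It assumes ThreadPoolExecutor.CallerRunsPolicy as the rejection handler, which the original snippet does not set: without a handler, ThreadPoolExecutor rejects a task when the queue is full, whereas with CallerRunsPolicy the submitting thread runs the task itself and therefore cannot outpace the workers. The names BoundedPoolSketch, newBoundedPool and runTasks are hypothetical.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPoolSketch {
    // Fixed-size pool whose queue holds at most maxQSize waiting tasks.
    static ThreadPoolExecutor newBoundedPool(int nThreads, int maxQSize) {
        return new ThreadPoolExecutor(nThreads, nThreads,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(maxQSize),
                // When the queue is full, the submitting thread runs the task itself.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Submits taskCount trivial tasks one at a time and returns how many actually ran.
    static int runTasks(int taskCount, int nThreads, int maxQSize) {
        ThreadPoolExecutor pool = newBoundedPool(nThreads, maxQSize);
        AtomicInteger done = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                // Stand-in for "manipulate one image"; tasks are created lazily, not stored in a list.
                pool.execute(() -> done.incrementAndGet());
            }
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("Tasks did not finish in time");
            }
            return done.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTasks(1_000_000, 10, 1000));
    }
}
```

Because each task is created only when it is submitted, no more than maxQSize tasks are queued at once, which is the memory saving this answer is after.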
