Java: Create large number of Callables or distribute iterator result to threads?
I've written an application to manipulate images. My manipulation code should be applied to all images in a folder (up to 1 million per folder). So far, for each image in the folder I create a Callable (a worker that manipulates the image) and add it to an ArrayList. Then I use the invokeAll method of a FixedThreadPool to parallelize the work.
However, my question is: is this good design? I doubt that adding 1 million elements to the ArrayList first really makes sense. I was thinking of passing an iterator (over the files) to all the threads and letting each thread take the next element and process it (which unfortunately introduces blocking, of course). Does that make sense?
It sounds OK, even if it is not necessarily very efficient and does not scale very well. An alternative design could be:
- Create an ArrayBlockingQueue<File> with a capacity larger than the size of your FixedThreadPool (say, twice as large).
- Create a FileVisitor, let's call it ImageFileVisitor, whose visitFile method puts the visited file in the queue. put is a blocking call, so it waits while the queue is full.
- Create as many Callables as there are threads in your pool and have each of them take from the queue and do what it has to do.

Note: the size of the thread pool should be fairly small. If your image processing is very heavy, use the number of processors as the pool size; if it is fairly trivial and most of the time is spent reading and writing files, use a smaller size.
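The queue-based design described above could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original answer: it uses Path instead of File (because Files.walkFileTree works with Path), the names ImageQueueDemo, processAll, manipulate and POISON are hypothetical, and a sentinel "poison pill" value is used to tell the workers that no more files are coming.

```java
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ImageQueueDemo {
    // Hypothetical sentinel; compared by identity, so it never clashes with a real path.
    private static final Path POISON = Paths.get("");

    // Hypothetical placeholder for the real image manipulation.
    static void manipulate(Path image) {
        // real work goes here
    }

    /** Walks root, hands every visited file to the workers, returns how many were processed. */
    static int processAll(Path root, int nThreads) throws Exception {
        // Bounded queue, twice the pool size: the producer blocks instead of piling up files.
        BlockingQueue<Path> queue = new ArrayBlockingQueue<>(2 * nThreads);
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            // One long-running worker per thread; each takes files until it sees the sentinel.
            Callable<Integer> worker = () -> {
                int done = 0;
                for (Path p = queue.take(); p != POISON; p = queue.take()) {
                    try {
                        manipulate(p);
                        done++;
                    } catch (RuntimeException e) {
                        System.err.println("failed on " + p + ": " + e);
                    }
                }
                return done;
            };
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < nThreads; i++) {
                results.add(pool.submit(worker));
            }
            // Producer: visitFile blocks in put() while the queue is full.
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    try {
                        queue.put(file);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return FileVisitResult.TERMINATE;
                    }
                    return FileVisitResult.CONTINUE;
                }
            });
            // One sentinel per worker so that every worker terminates.
            for (int i = 0; i < nThreads; i++) {
                queue.put(POISON);
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } finally {
            // Interrupts workers still waiting in take() if the walk failed early.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        int nThreads = Runtime.getRuntime().availableProcessors();
        System.out.println("processed " + processAll(root, nThreads) + " files");
    }
}
```

With this layout the number of objects held in memory at any time is bounded by the queue capacity plus the pool size, regardless of how many files the folder contains.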
A FixedThreadPool uses a LinkedBlockingQueue with a capacity of Integer.MAX_VALUE:
public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}
So it is effectively non-blocking (unbounded): you would be able to offer or put a million Runnable instances into it. That is surely an unnecessary use of memory, holding up to a million objects while your fixedPoolSize is much smaller by comparison, say 5 or 10.
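A quick way to see the unbounded queue for yourself is to inspect the pool's work queue. The sketch below assumes that the object returned by Executors.newFixedThreadPool is a ThreadPoolExecutor, which matches the factory method shown above but is an implementation detail rather than a documented guarantee; the class and method names are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class UnboundedQueueCheck {
    /** Returns the free capacity of the work queue of a fresh fixed thread pool. */
    static int remainingCapacity(int nThreads) {
        ExecutorService service = Executors.newFixedThreadPool(nThreads);
        try {
            // Assumption: the factory returns a ThreadPoolExecutor (implementation detail).
            ThreadPoolExecutor pool = (ThreadPoolExecutor) service;
            return pool.getQueue().remainingCapacity();
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) {
        // Compares the queue's free capacity with Integer.MAX_VALUE.
        System.out.println(remainingCapacity(10) == Integer.MAX_VALUE);
    }
}
```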
One approach that would directly improve this scenario is to use a FixedThreadPool with a finite queue size:
int nThreads = 10;    // number of worker threads
int maxQSize = 1000;  // upper bound on queued runnables
ExecutorService service = new ThreadPoolExecutor(nThreads, nThreads,
        0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>(maxQSize));
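With this approach there are at most 10 running threads and at most 1000 runnables waiting in the queue. Be aware, though, that ThreadPoolExecutor adds tasks to its queue with offer rather than put, so a full queue does not block the submitter: with the default handler, execute, submit and invokeAll throw RejectedExecutionException instead. To get the intended throttling, pass a RejectedExecutionHandler such as ThreadPoolExecutor.CallerRunsPolicy as the last constructor argument; the submitting thread then runs the rejected task itself, which slows submission down until some of the queued runnables finish.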
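Because ThreadPoolExecutor rejects rather than blocks when a bounded queue is full, a bounded pool needs a rejection handler to apply back-pressure. The following is a minimal sketch that uses ThreadPoolExecutor.CallerRunsPolicy for that purpose; the choice of handler, the class name BoundedPoolDemo, the method runAll and the trivial counting task are assumptions made for illustration, not part of the original answer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedPoolDemo {
    /** Submits `tasks` trivial jobs through a bounded pool and returns how many ran. */
    static int runAll(int tasks, int nThreads, int maxQSize) throws InterruptedException {
        ExecutorService service = new ThreadPoolExecutor(nThreads, nThreads,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(maxQSize),
                // When the queue is full, the submitting thread runs the task itself.
                new ThreadPoolExecutor.CallerRunsPolicy());
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            // Stand-in for the real image manipulation.
            service.execute(() -> completed.incrementAndGet());
        }
        service.shutdown();
        if (!service.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("pool did not finish in time");
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Never more than 1000 queued runnables, yet every task is executed.
        System.out.println(runAll(100_000, 10, 1000));
    }
}
```

Submitting tasks one at a time in a loop like this, rather than building a list of a million Callables for invokeAll, is what keeps the number of live task objects small.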