Java: Create large number of Callables or distribute iterator result to threads?

I've written an application to manipulate images. My manipulation code should be applied to all images in a folder (up to 1 million per folder).

So far, for each image in the folder I create a Callable (a worker that manipulates the image) and add it to an ArrayList. Then I use the invokeAll method of a FixedThreadPool to parallelize the work.

However, my question is: is this good design? I doubt that adding 1 million elements to the ArrayList first really makes sense. I was thinking of passing an iterator (over the files) to all the threads and letting each thread take the next element and process it (with the problem of blocking, of course, unfortunately). Does that make sense?

It sounds OK, even if it is not necessarily very efficient and does not scale very well. An alternative design could be:

  • create an ArrayBlockingQueue<File> bigger in size than your FixedThreadPool (say twice as big)
  • create a FileVisitor, let's call it ImageFileVisitor, which in the visitFile method puts the visited file in the queue; put is a blocking call, so it will wait while the queue is full
  • create as many Callables as the size of your pool and make each of them take from the queue and do what they have to do
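The three steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name QueueFedImageProcessing, the processFolder method, the END sentinel and the manipulate callback are all hypothetical names introduced here. It also assumes Path instead of File, because Files.walkFileTree works with Path. Note that if every worker fails, the visitor would block forever on put, so real code needs error handling around the manipulation step.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class QueueFedImageProcessing {

    // Hypothetical sentinel ("poison pill"): tells a worker that no more files will arrive.
    private static final Path END = Paths.get("__END_OF_FILES__");

    // Walks the folder and hands every visited file to the bounded queue.
    static class ImageFileVisitor extends SimpleFileVisitor<Path> {
        private final BlockingQueue<Path> queue;

        ImageFileVisitor(BlockingQueue<Path> queue) {
            this.queue = queue;
        }

        @Override
        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
            try {
                queue.put(file); // blocks while the queue is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return FileVisitResult.TERMINATE;
            }
            return FileVisitResult.CONTINUE;
        }
    }

    // Processes every file under `folder` and returns how many files were handled.
    static int processFolder(Path folder, int nThreads, Consumer<Path> manipulate) throws Exception {
        // Queue twice as big as the pool, as suggested above.
        BlockingQueue<Path> queue = new ArrayBlockingQueue<>(2 * nThreads);
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < nThreads; i++) {
                // One long-lived worker per pool thread, instead of one Callable per image.
                Callable<Integer> worker = () -> {
                    int count = 0;
                    for (Path p = queue.take(); p != END; p = queue.take()) {
                        manipulate.accept(p);
                        count++;
                    }
                    return count;
                };
                results.add(pool.submit(worker));
            }
            Files.walkFileTree(folder, new ImageFileVisitor(queue));
            for (int i = 0; i < nThreads; i++) {
                queue.put(END); // one sentinel per worker
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdownNow(); // also interrupts workers if the walk failed
        }
    }
}
```

With this layout, at most 2 * nThreads paths are held in memory at a time, regardless of how many images the folder contains.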

Note: the size of the Thread pool should be fairly small. If your image processing is very heavy, use the number of processors for the size, if it is somewhat trivial and most of the time is spent reading/writing files, use a smaller size.
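As a rough illustration of that sizing rule, the pool size can be derived from Runtime.getRuntime().availableProcessors(). The class name PoolSizing, both method names, and in particular the choice of half the cores for I/O-heavy work are assumptions made for this sketch; the right "smaller size" depends on your disks and should be measured.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolSizing {

    // cpuBound = true: one thread per processor, as suggested for heavy image processing.
    // cpuBound = false: a smaller pool; halving the core count is an arbitrary placeholder.
    static int poolSizeFor(boolean cpuBound) {
        int cores = Runtime.getRuntime().availableProcessors();
        return cpuBound ? cores : Math.max(1, cores / 2);
    }

    // Convenience factory that creates the fixed pool with the chosen size.
    static ExecutorService newImagePool(boolean cpuBound) {
        return Executors.newFixedThreadPool(poolSizeFor(cpuBound));
    }
}
```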

FixedThreadPool uses a LinkedBlockingQueue with the default capacity of Integer.MAX_VALUE:

public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

So it is effectively non-blocking: you would be able to offer / put a million Runnable instances to it. That is surely an unnecessary use of memory, holding millions of objects while your pool size is comparatively much smaller, say 5 or 10.

One approach which would directly improve this scenario is to use FixedThreadPool with a finite queue size:

int nThreads = 10;
int maxQSize = 1000;
// Same as newFixedThreadPool, but the work queue is bounded to maxQSize entries.
ExecutorService service = new ThreadPoolExecutor(nThreads, nThreads,
                                                 0L, TimeUnit.MILLISECONDS,
                                                 new LinkedBlockingQueue<Runnable>(maxQSize));

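With the above approach there are at most 10 running threads and at most 1000 queued Runnable instances at any time. Note, however, that the executor does not block by itself: ThreadPoolExecutor hands tasks to its queue with offer, not put, so once all 10 threads are busy and 1000 runnables are waiting, the default AbortPolicy rejects further submissions with a RejectedExecutionException. To make the submitter wait instead, pass a RejectedExecutionHandler such as ThreadPoolExecutor.CallerRunsPolicy (the submitting thread runs the task itself) or a handler that calls put on the executor's queue; submission then continues as soon as some tasks finish. Submit the tasks one by one while iterating over the folder rather than through invokeAll, which still needs the full collection of Callables up front.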
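A minimal sketch of a bounded pool whose submissions wait for free queue space, assuming a custom RejectedExecutionHandler that puts the rejected task back on the queue. The names BoundedSubmission, newBlockingFixedPool and runTasks are hypothetical, and the trivial counter task only stands in for the real image manipulation. If you prefer to stay with what the JDK ships, ThreadPoolExecutor.CallerRunsPolicy gives similar back-pressure without custom code.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedSubmission {

    // Builds a fixed-size pool whose execute()/submit() waits while the queue is full.
    static ThreadPoolExecutor newBlockingFixedPool(int nThreads, int maxQSize) {
        // Hypothetical handler: invoked when offer() fails because the queue is full.
        RejectedExecutionHandler blockUntilSpace = (task, executor) -> {
            if (executor.isShutdown()) {
                throw new RejectedExecutionException("executor is shut down");
            }
            try {
                executor.getQueue().put(task); // waits for a free slot
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException("interrupted while waiting", e);
            }
        };
        return new ThreadPoolExecutor(nThreads, nThreads,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(maxQSize),
                blockUntilSpace);
    }

    // Submits taskCount placeholder tasks one at a time and returns how many ran.
    static int runTasks(int taskCount, int nThreads, int maxQSize) throws InterruptedException {
        ThreadPoolExecutor pool = newBlockingFixedPool(nThreads, maxQSize);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(done::incrementAndGet); // stands in for one image manipulation
        }
        pool.shutdown(); // already queued tasks still run
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return done.get();
    }
}
```

Because the tasks are created inside the submission loop, no more than maxQSize + nThreads of them exist at once, which is the memory saving the bounded queue is meant to provide.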


 