
Using ExecutorService to process jobs in parallel

I am writing a Java program that needs to process a lot of URLs.
Each URL goes through the following jobs IN ORDER: download, analyze, compress.

Instead of having a single thread do all three jobs for each URL, I want each job to have its own fixed number of threads, so that every job has threads running concurrently at any given time.

For example, the download job will have multiple threads fetching URLs. As soon as one URL is downloaded, it is passed to a thread in the analyze job, and as soon as that completes, it is passed to a thread in the compress job, and so on.

I am thinking of using the CompletionService in Java, since it returns each result as soon as it is finished, but I am not sure how it works. So far my code looks like this:

ExecutorService executor = Executors.newFixedThreadPool(3);
CompletionService<DownloadedItem> completionService = new ExecutorCompletionService<DownloadedItem>(executor);

//while list has URL do {
   executor.submit(new DownloadJob(list.getNextURL()));//submit to queue for download
//}

//while there is URL left do {
   Future<DownloadedItem> downloadedItem = completionService.take();//take the result as soon as it finishes
   //what to do here??
//}

My question is how do I move the downloaded item to the analyze job and do the work there without waiting for all the download jobs to complete? I am thinking of creating a CompletionService for each job, is that a viable method? If not, is there a better alternative way to solve this problem? Please provide examples.

Once you say the steps must run IN ORDER, any attempt to use separate threads for those ordered steps will only complicate the design of your system.

In my opinion, your best shot is to have each thread handle one URL from start to finish. To model the 3 steps you can introduce another abstraction (for example, 3 Callables), but you still want to execute them sequentially in one thread. There is no need for a CompletionService.
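A minimal sketch of that idea, assuming the three steps can be written as plain methods. The names PerUrlSketch, download, analyze and compress are hypothetical stand-ins, not part of the question's code; each pool thread takes one URL through all three steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerUrlSketch {
    // Hypothetical stand-ins for the real download / analyze / compress work.
    static String download(String url) { return "downloaded(" + url + ")"; }
    static String analyze(String content) { return "analyzed(" + content + ")"; }
    static String compress(String analyzed) { return "compressed(" + analyzed + ")"; }

    // One task per URL; the three steps run sequentially inside that task.
    static List<String> processAll(List<String> urls, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<String> job = () -> compress(analyze(download(url)));
                futures.add(pool.submit(job));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // waits for that URL; results keep submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("a", "b"), 3));
    }
}
```

Different URLs still run in parallel across the pool; only the steps for a single URL are serialized.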

You are pretty close. First submit your tasks to CompletionService instead:

completionService.submit(new DownloadJob(list.getNextURL()));

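Now take the next completed Future and read its result:

DownloadedItem downloadedItem = completionService.take().get();

The call to take() blocks until one of the submitted tasks has finished; get() on the returned Future then returns immediately, or throws ExecutionException if the task failed. Repeat the line above once for every item you submitted.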
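Put together, the submit-then-take loop could look like the sketch below. The class CompletionSketch and its download method are hypothetical placeholders for the question's DownloadJob, and String stands in for DownloadedItem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionSketch {
    // Hypothetical stand-in for the real download work.
    static String download(String url) { return "downloaded(" + url + ")"; }

    static List<String> downloadAll(List<String> urls) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        CompletionService<String> completionService =
                new ExecutorCompletionService<>(executor);
        try {
            // Submit through the CompletionService, not the executor itself.
            for (String url : urls) {
                completionService.submit(() -> download(url));
            }
            // Take exactly as many results as tasks were submitted.
            List<String> results = new ArrayList<>();
            for (int i = 0; i < urls.size(); i++) {
                // take() blocks until some task is done; order is completion order.
                results.add(completionService.take().get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a download failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(downloadAll(Arrays.asList("a", "b", "c")));
    }
}
```

Because results arrive in completion order, the returned list is not guaranteed to match the order of the input URLs.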


If you need much greater throughput (with the pool above, at most three URLs are downloaded at a time), consider async-http-client, which lets you download from thousands of URLs simultaneously. It uses NIO and is event driven, so it does not dedicate a thread to each request.

What you are describing is called a pipeline. Basically, the output of the download task is the input of the analyze task, and the output of analyze is the input of compress. There seem to be two options to accomplish this:

1) Let the download task know about the next stage of the pipeline so that it can submit its results itself.

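// needs: import java.util.concurrent.ExecutorService;
class DownloadTask implements Runnable {
    // Pool for the analyze stage (ExecutorService, because Executor has no submit())
    private final ExecutorService analyzePipeline;

    DownloadTask(ExecutorService analyzePipeline) {
        this.analyzePipeline = analyzePipeline;
    }

    public void run() {
        //Do download stuff, producing downloadedContent
        analyzePipeline.submit(new AnalyzeTask(downloadedContent));
    }
}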
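Option 1 can be sketched end to end as follows. This is only an illustration under assumptions: ChainedPipelineSketch, the three pool names and the string-building stand-ins for each stage are hypothetical, and the CountDownLatch is just one way to know when every URL has left the last stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ChainedPipelineSketch {
    static List<String> run(List<String> urls) {
        // One fixed pool per stage, as asked for in the question.
        ExecutorService downloadPool = Executors.newFixedThreadPool(3);
        ExecutorService analyzePool = Executors.newFixedThreadPool(3);
        ExecutorService compressPool = Executors.newFixedThreadPool(3);
        Queue<String> results = new ConcurrentLinkedQueue<>();
        CountDownLatch done = new CountDownLatch(urls.size());
        try {
            for (String url : urls) {
                downloadPool.execute(() -> {
                    String content = "downloaded(" + url + ")"; // stand-in for real work
                    // Each stage hands its output to the next stage's pool itself.
                    analyzePool.execute(() -> {
                        String analyzed = "analyzed(" + content + ")";
                        compressPool.execute(() -> {
                            results.add("compressed(" + analyzed + ")");
                            done.countDown();
                        });
                    });
                });
            }
            // No failure handling here: a stage that throws would never count down.
            done.await();
            return new ArrayList<>(results);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            downloadPool.shutdown();
            analyzePool.shutdown();
            compressPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "c")));
    }
}
```

An item moves to the analyze pool the moment its own download ends, without waiting for the other downloads.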

2) Allow another thread to move the results from the download tasks into the pipeline for the analyze task.

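ExecutorService executor = Executors.newFixedThreadPool(3);
ExecutorService analyzeExecutor = Executors.newFixedThreadPool(3);
CompletionService<DownloadedItem> completionService =
        new ExecutorCompletionService<DownloadedItem>(executor);

// while list has URL do {
   completionService.submit(new DownloadJob(list.getNextURL())); // submit to queue for download
// }

new Thread() {
    public void run() {
        // while there is URL left do {
            try {
                // take() blocks until some download has finished
                Future<DownloadedItem> downloadedItem = completionService.take();
                analyzeExecutor.submit(new AnalyzeJob(downloadedItem.get()));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop moving results
            } catch (ExecutionException e) {
                // the download failed; log it or skip this item
            }
        // }
    }
}.start(); // the thread must be started, otherwise run() is never called
//...and so on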
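A runnable variant of option 2 might look like the sketch below. It assumes one CompletionService per stage, which is the approach the question asks about; MoverThreadSketch and the string-building stand-ins for the download and analyze work are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class MoverThreadSketch {
    static List<String> run(List<String> urls) {
        ExecutorService downloadExecutor = Executors.newFixedThreadPool(3);
        ExecutorService analyzeExecutor = Executors.newFixedThreadPool(3);
        CompletionService<String> downloads = new ExecutorCompletionService<>(downloadExecutor);
        CompletionService<String> analyses = new ExecutorCompletionService<>(analyzeExecutor);

        for (String url : urls) {
            downloads.submit(() -> "downloaded(" + url + ")"); // stand-in for real work
        }

        // Mover thread: forwards each finished download to the analyze stage.
        Thread mover = new Thread(() -> {
            try {
                for (int i = 0; i < urls.size(); i++) {
                    String content = downloads.take().get();
                    analyses.submit(() -> "analyzed(" + content + ")");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (ExecutionException e) {
                // Sketch only: a failed download ends the mover, so fewer results arrive.
                throw new IllegalStateException("a download failed", e.getCause());
            }
        });
        mover.start();

        try {
            // Collect analyze results while the mover is still feeding the stage.
            List<String> results = new ArrayList<>();
            for (int i = 0; i < urls.size(); i++) {
                results.add(analyses.take().get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("an analyze job failed", e.getCause());
        } finally {
            downloadExecutor.shutdown();
            analyzeExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "c")));
    }
}
```

A compress stage would follow the same pattern: a third pool, a third CompletionService, and a second mover between analyze and compress.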

