
Which ThreadPool in Java should I use?

There is a huge number of tasks. Each task belongs to a single group. The requirement is that each group's tasks execute serially, as if they ran on a single thread, while throughput is maximized in a multi-core (or multi-CPU) environment. Note: the number of groups is also huge, proportional to the number of tasks.

The naive solution is to use a ThreadPoolExecutor with synchronization (or locks). However, threads would block each other and throughput would not be maximized.

Any better ideas? Or is there a third-party library that satisfies this requirement?

A simple approach would be to "concatenate" all of a group's tasks into one super task, thus making the sub-tasks run serially. But this will probably delay other groups, which cannot start until some other group finishes completely and frees a thread in the pool.

As an alternative, consider chaining a group's tasks. The following code illustrates it:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MultiSerialExecutor {
    private final ExecutorService executor;

    public MultiSerialExecutor(int maxNumThreads) {
        executor = Executors.newFixedThreadPool(maxNumThreads);
    }

    public void addTaskSequence(List<Runnable> tasks) {
        if (!tasks.isEmpty()) // an empty chain has nothing to run
            executor.execute(new TaskChain(tasks));
    }

    public void shutdown() {
        executor.shutdown();
    }

    // Runs one task of the group, then re-submits itself for the next one,
    // so other groups get a turn on the pool between tasks.
    private class TaskChain implements Runnable {
        private final List<Runnable> seq;
        private int ind;

        public TaskChain(List<Runnable> seq) {
            this.seq = seq;
        }

        @Override
        public void run() {
            seq.get(ind++).run(); //NOTE: No special error handling
            if (ind < seq.size())
                executor.execute(this);
        }
    }
}

The advantage is that no extra resources (threads/queues) are used, and that the granularity of tasks is finer than in the naive approach. The disadvantage is that all of a group's tasks must be known in advance.

--edit--

To make this solution generic and complete, you may want to decide on error handling (i.e. whether a chain continues even if an error occurs). It would also be a good idea to implement ExecutorService and delegate all calls to the underlying executor.
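One possible error-handling policy is to stop a chain at its first failing task. The following is a minimal sketch of that idea, not part of the original answer: FailFastChain, its await/failure methods and the demo class are hypothetical names, and it only catches RuntimeException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical variant of TaskChain: abandons the rest of the chain on the first error.
class FailFastChain implements Runnable {
    private final ExecutorService executor;
    private final List<Runnable> seq;
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile Throwable failure; // first error, or null
    private int ind;                    // only one thread touches the chain at a time

    FailFastChain(ExecutorService executor, List<Runnable> seq) {
        this.executor = executor;
        this.seq = seq;
    }

    @Override
    public void run() {
        if (ind >= seq.size()) { // empty chain
            done.countDown();
            return;
        }
        try {
            seq.get(ind++).run();
        } catch (RuntimeException e) {
            failure = e;         // record the error and skip the remaining tasks
            done.countDown();
            return;
        }
        if (ind < seq.size()) {
            executor.execute(this); // re-submit for the next task
        } else {
            done.countDown();
        }
    }

    // Waits until the chain has finished or failed; returns false on timeout.
    boolean await(long millis) {
        try {
            return done.await(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    Throwable failure() {
        return failure;
    }
}

class FailFastChainDemo {
    static List<String> run() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<String> log = new CopyOnWriteArrayList<>();
        List<Runnable> tasks = Arrays.asList(
                () -> log.add("a"),
                () -> { throw new IllegalStateException("boom"); },
                () -> log.add("c")); // never runs: the chain stops at the failure
        FailFastChain chain = new FailFastChain(pool, tasks);
        pool.execute(chain);
        chain.await(5000);
        pool.shutdown();
        if (chain.failure() != null) {
            log.add("failed: " + chain.failure().getMessage());
        }
        return log;
    }
}
```

The opposite policy (log the error and keep going) would only need the catch block to fall through to the re-submit step instead of returning.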

I would suggest using task queues:

  • For every group of tasks, create a queue and insert all tasks from that group into it.
  • Now all your queues can be executed in parallel, while the tasks inside one queue are executed serially.
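The two steps above can be sketched as one small serial executor per group, all sharing a single thread pool. This is an illustration under assumptions rather than the answerer's implementation: SerialExecutor follows the pattern shown in the java.util.concurrent.Executor Javadoc, while GroupedExecutor, its submit method and the demo class are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Runs its tasks one at a time, in submission order, on a shared pool.
class SerialExecutor implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    private final Executor pool;
    private Runnable active;

    SerialExecutor(Executor pool) {
        this.pool = pool;
    }

    @Override
    public synchronized void execute(Runnable task) {
        tasks.add(() -> {
            try {
                task.run();
            } finally {
                scheduleNext(); // hand the group's next task to the pool
            }
        });
        if (active == null) {
            scheduleNext();
        }
    }

    private synchronized void scheduleNext() {
        active = tasks.poll();
        if (active != null) {
            pool.execute(active);
        }
    }
}

// Hypothetical facade: one SerialExecutor (one queue) per group key.
class GroupedExecutor {
    private final Executor pool;
    private final Map<Object, SerialExecutor> groups = new ConcurrentHashMap<>();

    GroupedExecutor(Executor pool) {
        this.pool = pool;
    }

    void submit(Object groupKey, Runnable task) {
        groups.computeIfAbsent(groupKey, k -> new SerialExecutor(pool)).execute(task);
    }
}

class GroupedExecutorDemo {
    // Submits tasks 0..2 to groups "A" and "B"; returns the order each group saw.
    static Map<String, List<Integer>> run() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        GroupedExecutor grouped = new GroupedExecutor(pool);
        Map<String, List<Integer>> seen = new ConcurrentHashMap<>();
        seen.put("A", new ArrayList<>());
        seen.put("B", new ArrayList<>());
        CountDownLatch latch = new CountDownLatch(6);
        for (int i = 0; i < 3; i++) {
            final int n = i;
            for (String group : seen.keySet()) {
                grouped.submit(group, () -> {
                    seen.get(group).add(n); // safe: one task per group at a time
                    latch.countDown();
                });
            }
        }
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return seen;
    }
}
```

Unlike the chaining approach, tasks can be added to a group at any time. The sketch never removes idle groups from the map, so with a very large number of groups some cleanup policy would be needed.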

A quick Google search suggests that the Java API has no ready-made task/thread queues of this kind. However, there are many tutorials on coding one. Feel free to list good tutorials or implementations if you know of any:

I mostly agree with Dave's answer, but if you need to slice CPU time across all "groups", i.e. all task groups should progress in parallel, you might find this kind of construct useful (it uses removal as a "lock"). This worked fine in my case, although I imagine it tends to use more memory:

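import java.util.Collection;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class TaskAllocator {
    // One child queue per task group. Polling a group from this outer queue
    // "locks" it: no other worker can see the group until it is released.
    private final ConcurrentLinkedQueue<Queue<Runnable>> entireWork;

    // The caller supplies the child queues, one per task group.
    public TaskAllocator(Collection<Queue<Runnable>> childQueuePerTaskGroup){
        entireWork = new ConcurrentLinkedQueue<>(childQueuePerTaskGroup);
    }

    public Queue<Runnable> lockTaskGroup(){
        return entireWork.poll();
    }

    public void release(Queue<Runnable> taskGroup){
        entireWork.offer(taskGroup);
    }
 }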

and

 import java.util.Queue;

 class DoWork implements Runnable {
     private final TaskAllocator allocator;

     public DoWork(TaskAllocator allocator){
         this.allocator = allocator;
     }

     public void run(){
        for(;;){
            Queue<Runnable> taskGroup = allocator.lockTaskGroup();
            if(taskGroup == null){
                //No more work (or every remaining group is held by another worker)
                return;
            }
            Runnable work = taskGroup.poll();
            if(work == null){
                //This group is done; it is not released, so it drops out
                continue;
            }

            //Do work, but never forget to release the group to 
            // the allocator.
            try {
                work.run();
            } finally {
                allocator.release(taskGroup);
            }
        }//for
     }
 }

You can then use an optimal number of threads to run the DoWork task. It's a kind of round-robin load balancing.
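As a rough, self-contained illustration of that wiring, the sketch below inlines the allocator and worker logic into one class and runs the worker on a fixed pool. RoundRobinDemo, the group names and the task counts are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class RoundRobinDemo {
    // Runs three groups of four tasks each; returns the order each group saw.
    static Map<String, List<Integer>> run(int threads) {
        // Outer queue of task groups; polling a group acts as the "lock".
        Queue<Queue<Runnable>> entireWork = new ConcurrentLinkedQueue<>();
        Map<String, List<Integer>> seen = new LinkedHashMap<>();
        for (String name : new String[] {"A", "B", "C"}) {
            List<Integer> log = new ArrayList<>();
            seen.put(name, log);
            Queue<Runnable> tasks = new ArrayDeque<>();
            for (int i = 0; i < 4; i++) {
                final int n = i;
                tasks.add(() -> log.add(n));
            }
            entireWork.offer(tasks);
        }

        // Same loop as DoWork.run(): take a group, run one task, give it back.
        Runnable worker = () -> {
            Queue<Runnable> group;
            while ((group = entireWork.poll()) != null) {
                Runnable work = group.poll();
                if (work == null) {
                    continue; // group exhausted: do not release it
                }
                try {
                    work.run();
                } finally {
                    entireWork.offer(group);
                }
            }
        };

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(worker);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

A common starting point for the thread count is Runtime.getRuntime().availableProcessors() for CPU-bound tasks. Note that a worker exits as soon as it finds the outer queue empty, which can happen while other workers still hold groups; the remaining work still completes, but with fewer threads.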

You can even do something more sophisticated by using the following instead of a simple queue in TaskAllocator (task groups with more tasks remaining tend to get executed first):

ConcurrentSkipListSet<MyQueue<Runnable>> sophisticatedQueue = 
    new ConcurrentSkipListSet<>(new SophisticatedComparator());

where SophisticatedComparator is:

class SophisticatedComparator implements Comparator<MyQueue<Runnable>> {
    public int compare(MyQueue<Runnable> o1, MyQueue<Runnable> o2){
        int diff = o2.size() - o1.size();
        if(diff==0){
             //This is crucial. You must assign unique ids to your 
             //Subqueue and break the equality if they happen to have same size.
             //Otherwise your queues will disappear...
             return o1.id - o2.id;
        }
        return diff;
    }
 }
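The answer does not show MyQueue itself. A minimal sketch of what it might look like, together with a check of the ordering, is given below. The MyQueue definition, the LargestFirst comparator name and the demo are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical MyQueue: a plain deque plus a unique id used as a tie-breaker.
class MyQueue<T> extends ArrayDeque<T> {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();
    final int id = NEXT_ID.getAndIncrement();
}

// Same ordering as the comparator above: bigger groups first, ids break ties.
class LargestFirst implements Comparator<MyQueue<Runnable>> {
    @Override
    public int compare(MyQueue<Runnable> o1, MyQueue<Runnable> o2) {
        int bySize = Integer.compare(o2.size(), o1.size());
        return bySize != 0 ? bySize : Integer.compare(o1.id, o2.id);
    }
}

class LargestFirstDemo {
    static MyQueue<Runnable> groupOf(int taskCount) {
        MyQueue<Runnable> q = new MyQueue<>();
        for (int i = 0; i < taskCount; i++) {
            q.add(() -> { });
        }
        return q;
    }

    // Returns the sizes of the groups in the order the set hands them out.
    static List<Integer> pollOrder() {
        ConcurrentSkipListSet<MyQueue<Runnable>> set =
                new ConcurrentSkipListSet<>(new LargestFirst());
        set.add(groupOf(1));
        set.add(groupOf(3));
        set.add(groupOf(3)); // kept only because the ids differ
        set.add(groupOf(2));
        List<Integer> sizes = new ArrayList<>();
        MyQueue<Runnable> next;
        while ((next = set.pollFirst()) != null) {
            sizes.add(next.size());
        }
        return sizes;
    }
}
```

With this ordering, lockTaskGroup would call pollFirst() and release would call add(). A group's size must only change while the group is outside the set, which the lock/release protocol already guarantees.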

Actors are another solution for this kind of problem. Scala has actors, and so does Java, where they are provided by Akka.

I had a problem similar to yours, and I used an ExecutorCompletionService, which works with an Executor to complete collections of tasks. Here is an extract from the java.util.concurrent API documentation (Java 7):

Suppose you have a set of solvers for a certain problem, each returning a value of some type Result, and would like to run them concurrently, processing the results of each of them that return a non-null value, in some method use(Result r). You could write this as:

void solve(Executor e, Collection<Callable<Result>> solvers)
        throws InterruptedException, ExecutionException {
    CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
    for (Callable<Result> s : solvers)
        ecs.submit(s);
    int n = solvers.size();
    for (int i = 0; i < n; ++i) {
        Result r = ecs.take().get();
        if (r != null)
            use(r);
    }
}

So, in your scenario, every task will be a single Callable<Result>, and tasks will be grouped in a Collection<Callable<Result>>.

Reference: http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorCompletionService.html
