
Java thread-safe passing of collection objects from one thread to another

I have a Java application with worker threads that process jobs. A worker produces a result object, something like:

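class WorkerResult {
    private final Set<ResultItems> items;

    // The constructor must carry the class name (the original called it Worker).
    public WorkerResult(Set<ResultItems> pItems) {
        items = pItems;
    }

    public Set<ResultItems> getItems() {
        return items;
    }
}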

When the worker finishes, it does the following:

 ...
 final Set<ResultItems> items = new SomeNonThreadSafeSetImplSet<ResultItems>();
 for (ResultItems producedItem : ...) {
      items.add(producedItem);  // add the loop variable (the original added an undefined 'item')
 }
 passToGatherThread(items);

The items set is a kind of "unit of work" here. The passToGatherThread method passes the items set to a gather thread, of which only one exists at runtime.

Synchronization is not needed to prevent race conditions here, because only one thread (the gather thread) reads the items set. But AFAICS, the gather thread may still not see all items because the set is not thread-safe, right?

Assume I cannot make passToGatherThread synchronized, say because it belongs to a 3rd-party library. What I basically fear is that the gather thread does not see all items because of caching, VM optimizations, etc. So here is the question: how do I pass the items set in a thread-safe manner, such that the gather thread "sees" the complete set of items?

There seems to be no synchronization issue here. You create a new Set object for each passToGatherThread call and pass it only after you have finished modifying it. No objects will be lost.

Set (and most Java collections) can be accessed concurrently by many threads, provided that no modification is made to the collection. That's what Collections.unmodifiableCollection is for.
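As a small illustration of the read-only idea, the sketch below builds a set privately and then exposes it only through Collections.unmodifiableSet. The class and method names are made up for this example, and String stands in for ResultItems. Note that the wrapper only rejects modification through the view; it does not by itself make earlier writes visible to another thread.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class ReadOnlyViewDemo {

    /** Builds the set privately, then exposes only a read-only view of it. */
    static Set<String> buildResult() {
        Set<String> items = new HashSet<>();
        items.add("a");
        items.add("b");
        // After this point nobody holds a reference to the mutable set.
        return Collections.unmodifiableSet(items);
    }

    public static void main(String[] args) {
        Set<String> view = buildResult();
        System.out.println(view.contains("a"));
        try {
            view.add("c"); // mutators on the view throw
        } catch (UnsupportedOperationException e) {
            System.out.println("modification rejected");
        }
    }
}
```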

Since the passToGatherThread method serves as the channel of communication with another thread, it must use some kind of synchronization internally, and that synchronization establishes memory consistency between the two threads.
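To make that concrete, here is a minimal sketch that assumes the handoff is implemented with a BlockingQueue. That is only an assumption for illustration; the question does not say how passToGatherThread works internally. The java.util.concurrent documentation states that actions in a thread prior to placing an object into a concurrent collection happen-before actions after another thread removes it, so a plain HashSet filled before put() is fully visible after take().

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueHandoffDemo {

    /** Fills a plain HashSet on a worker thread, hands it off, and returns the size the gatherer sees. */
    static int runOnce(int itemCount) throws InterruptedException {
        // Hypothetical stand-in for whatever passToGatherThread uses internally.
        BlockingQueue<Set<String>> handoff = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            Set<String> items = new HashSet<>(); // not thread-safe, and does not need to be
            for (int i = 0; i < itemCount; i++) {
                items.add("item-" + i);
            }
            try {
                handoff.put(items); // all writes above happen-before the take() below
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        Set<String> received = handoff.take(); // gather side: blocks until the set arrives
        worker.join();
        return received.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce(1000));
    }
}
```

The worker never touches the set after put(), which is what keeps the unsynchronized HashSet safe in this arrangement.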

Also, please note that all writes to the objects in the passed collection are made before it is passed to the other thread. Even if the memory is copied into a thread's local cache, it holds the same unmodified value as in the other thread.

You could simply use one of the thread-safe implementations of Set that Java provides for your WorkerResult.

Another option is to use Collections.synchronizedSet().
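As a rough sketch of both options, the example below creates one set with ConcurrentHashMap.newKeySet() (available since Java 8) and one with Collections.synchronizedSet(), then fills each from two threads. The class and method names are invented for this illustration, and Integer stands in for ResultItems.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ThreadSafeSetDemo {

    /** A concurrent set backed by ConcurrentHashMap. */
    static Set<Integer> concurrentSet() {
        return ConcurrentHashMap.newKeySet();
    }

    /** A HashSet wrapped so that every call locks on the wrapper. */
    static Set<Integer> synchronizedSet() {
        // Iterating over this set still requires a manual synchronized (set) { ... } block.
        return Collections.synchronizedSet(new HashSet<>());
    }

    /** Adds 0..999 from one thread and 1000..1999 from another, then returns the final size. */
    static int fillFromTwoThreads(Set<Integer> shared) throws InterruptedException {
        Thread a = new Thread(() -> {
            for (int i = 0; i < 1000; i++) shared.add(i);
        });
        Thread b = new Thread(() -> {
            for (int i = 1000; i < 2000; i++) shared.add(i);
        });
        a.start();
        b.start();
        a.join();
        b.join();
        return shared.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fillFromTwoThreads(concurrentSet()));
        System.out.println(fillFromTwoThreads(synchronizedSet()));
    }
}
```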

I have thought about (and discussed) this question a lot, and I have come up with another answer which, I hope, is the best solution.

Passing a synchronized collection is not good in terms of efficiency, because every subsequent operation on that collection will be synchronized. If there are many operations, this may prove to be a handicap.

To the point: let's make some assumptions (which I do not agree with):

  • the passToGatherThread method is indeed unsafe, however improbable that seems
  • the compiler can reorder the operations in the code so that passToGatherThread is called before the collection is filled

The simplest, cleanest and possibly the most efficient way to ensure that the collection passed to the gatherer method is ready and complete is to wrap the call that hands over the collection in a synchronized block, like this:

synchronized(items) {
  passToGatherThread(items);
}

That way we ensure memory synchronization and a valid happens-before ordering before the collection is passed, thus making sure that all objects are passed correctly. Note that under the Java Memory Model this guarantee holds only if the gather thread also synchronizes on the same monitor before it reads the set.
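For comparison, here is a sketch of a handoff in which both sides lock the same monitor, since the Java Memory Model orders an unlock before every subsequent lock of that same monitor. The MonitorHandoff class and its methods are hypothetical and are not part of the question's code; String stands in for ResultItems.

```java
import java.util.HashSet;
import java.util.Set;

class MonitorHandoff {
    private final Object lock = new Object();
    private Set<String> published; // guarded by lock

    /** Worker side: publishes the finished set while holding the lock. */
    void publish(Set<String> items) {
        synchronized (lock) {
            published = items;
            lock.notifyAll();
        }
    }

    /** Gather side: waits on the same lock until a set has been published. */
    Set<String> awaitItems() throws InterruptedException {
        synchronized (lock) {
            while (published == null) {
                lock.wait();
            }
            return published;
        }
    }

    /** Runs one worker and returns the number of items the gathering thread sees. */
    static int demo() throws InterruptedException {
        MonitorHandoff handoff = new MonitorHandoff();
        Thread worker = new Thread(() -> {
            Set<String> items = new HashSet<>();
            for (int i = 0; i < 100; i++) {
                items.add("item-" + i);
            }
            handoff.publish(items); // writes above are visible to whoever locks 'lock' afterwards
        });
        worker.start();
        Set<String> seen = handoff.awaitItems();
        worker.join();
        return seen.size();
    }
}
```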

The worker implements Callable and returns a WorkerResult:

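import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

class Worker implements Callable<WorkerResult> {
    private final WorkerInput in;

    public Worker(WorkerInput in) {
        this.in = in;
    }

    @Override
    public WorkerResult call() {
        Set<ResultItems> items = new HashSet<>();
        // do work here, adding each produced item to the set
        return new WorkerResult(items);
    }
}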

Then we use an ExecutorService to manage the thread pool and collect the results using Future.

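import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledWorkerController {

    private static final int MAX_THREAD_POOL = 3;
    private final ExecutorService pool =
        Executors.newFixedThreadPool(MAX_THREAD_POOL);

    public Set<ResultItems> process(List<WorkerInput> inputs)
            throws InterruptedException, ExecutionException {
        // Submit one Worker per input and keep the futures in submission order.
        List<Future<WorkerResult>> submitted = new ArrayList<>();
        for (WorkerInput in : inputs) {
            Future<WorkerResult> future = pool.submit(new Worker(in));
            submitted.add(future);
        }
        // get() blocks until the worker is done; its writes are visible once get() returns.
        Set<ResultItems> results = new HashSet<>();
        for (Future<WorkerResult> future : submitted) {
            results.addAll(future.get().getItems());
        }
        return results;
    }
}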
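Because WorkerInput, WorkerResult and ResultItems are not shown in full, the following is a compact, self-contained variant of the same idea that can be run as-is. It is only a sketch: Integer and String stand in for the input and item types, the FutureGatherDemo name is invented, and the pool is shut down in a finally block, which the controller above leaves to its owner.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureGatherDemo {

    /** Runs one job per input on a small pool and merges all produced items into one set. */
    static Set<String> gather(List<Integer> inputs)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Future<Set<String>>> submitted = new ArrayList<>();
            for (Integer input : inputs) {
                Callable<Set<String>> job = () -> {
                    // Each job fills its own plain HashSet; no other thread touches it meanwhile.
                    Set<String> items = new HashSet<>();
                    items.add("result-" + input);
                    items.add("double-" + (input * 2));
                    return items;
                };
                submitted.add(pool.submit(job));
            }
            Set<String> results = new HashSet<>();
            for (Future<Set<String>> future : submitted) {
                // Work done by the job happens-before the return of get().
                results.addAll(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Only the calling thread ever reads the per-job sets, and it does so after get() returns, so no thread-safe Set implementation is needed.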


 