Java thread-safe passing of collection objects from one thread to another

I have a Java application which has worker threads to process jobs. A worker produces a result object, say something like:

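class WorkerResult {
    private final Set<ResultItems> items;

    public WorkerResult(Set<ResultItems> pItems) {
        items = pItems;
    }

    // Read accessor used by whoever gathers the results
    public Set<ResultItems> getItems() {
        return items;
    }
}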

When the worker finishes, it does this operation:

 ...
 final Set<ResultItems> items = new SomeNonThreadSafeSetImplSet<ResultItems>();
 for (ResultItems producedItem : ...) {
      items.add(producedItem);
 }
 passToGatherThread(items);

The items set is kind of a "unit-of-work" here. The passToGatherThread method passes the items set to a gather thread, of which only one exists at runtime.

Mutual exclusion is not the concern here: race conditions cannot occur, because only one thread (the gather thread) reads the items set, and only after the worker has finished filling it. What worries me is visibility. AFAICS, the gather thread may not see all items because the set is not thread-safe, right?

Assume I cannot make passToGatherThread synchronized, say because it is a 3rd party library. What I basically fear is that the gather thread does not see all items because of caching, VM optimizations, etc. So here comes the question: How to pass the items set in a thread-safe manner, such that the Gather thread "sees" the proper set of items?

There seems to be no synchronization issue here. You create a new Set object for each passToGatherThread and do it after modifying the set. No objects will be lost.

A Set (like most Java collections) can be read concurrently by many threads as long as nothing modifies the collection. That is what wrappers such as Collections.unmodifiableCollection are for.

Since the passToGatherThread method serves as the means of communication with another thread, it must use some kind of synchronization internally, and every such synchronization ensures memory consistency between the threads involved.

Also note that all writes to the objects in the collection are made before it is passed to the other thread. Even if the memory is copied into a thread's local cache, it holds the same, unmodified values the worker thread wrote.
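To illustrate the point about the hand-off mechanism providing the memory consistency, here is a minimal sketch that assumes passToGatherThread is implemented on top of a java.util.concurrent.BlockingQueue. That is an assumption, since the question does not say how the method works. The class name QueueHandOffSketch and the use of String items are made up for the example. The java.util.concurrent documentation states that actions in a thread prior to placing an object into a concurrent collection happen-before actions after its removal in another thread, so the plain HashSet needs no locking of its own.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for passToGatherThread: a queue-based hand-off.
class QueueHandOffSketch {

    // Worker side: fill a plain HashSet, then publish it exactly once, read-only.
    static void produce(BlockingQueue<Set<String>> handOff, int count) {
        Set<String> items = new HashSet<>();
        for (int i = 0; i < count; i++) {
            items.add("item-" + i);
        }
        // Everything written above happens-before the gatherer's take() returns.
        handOff.add(Collections.unmodifiableSet(items));
    }

    // Starts one worker thread and gathers its set on the calling thread.
    static int runOnce(int count) {
        BlockingQueue<Set<String>> handOff = new LinkedBlockingQueue<>();
        Thread worker = new Thread(() -> produce(handOff, count));
        worker.start();
        try {
            Set<String> received = handOff.take(); // blocks until the worker publishes
            worker.join();
            return received.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("gathered " + runOnce(1000) + " items");
    }
}
```

The unmodifiable wrapper is optional; it only guards against a worker accidentally changing the set after it has been handed over.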

You could simply use one of the thread-safe implementations of Set that Java provides for your WorkerResult.

Another option is to use Collections.synchronizedSet() .
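As a rough sketch of both options, the snippet below creates a set with ConcurrentHashMap.newKeySet() (Java 8+) and one with Collections.synchronizedSet(), then lets two threads add to the same set. The class and method names are invented for the example, and Integer stands in for ResultItems.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ThreadSafeSetSketch {

    // Concurrent set view backed by a ConcurrentHashMap.
    static <T> Set<T> concurrentSet() {
        return ConcurrentHashMap.newKeySet();
    }

    // Wrapper that locks on every call; iterating over it still
    // requires the caller to synchronize on the returned set.
    static <T> Set<T> synchronizedSet() {
        return Collections.synchronizedSet(new HashSet<>());
    }

    // Two threads add disjoint ranges to the same set; returns the final size.
    static int fillFromTwoThreads(Set<Integer> shared, int perThread) {
        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                shared.add(i);
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                shared.add(perThread + i);
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads(ThreadSafeSetSketch.<Integer>concurrentSet(), 5000));
        System.out.println(fillFromTwoThreads(ThreadSafeSetSketch.<Integer>synchronizedSet(), 5000));
    }
}
```

In the scenario from the question only one thread writes each set, so these implementations mainly buy visibility guarantees rather than protection against concurrent writes.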

I have thought about (and discussed) this question a lot and I have come up with another answer, which, I hope, will be the best solution.

Passing a synchronized collection is not good in terms of efficiency, because each subsequent operation on that collection will be synchronized - if there are many operations, it may prove to be a handicap.

To the point: let's make some assumptions (which I do not agree with):

  • the passToGatherThread method is indeed unsafe, however improbable that seems
  • the compiler can reorder the code so that passToGatherThread is called before the collection is filled

The simplest, cleanest and possibly the most efficient way to ensure that the collection passed to gatherer method is ready and complete is to put the collection push in a synchronized block, like this:

synchronized(items) {
  passToGatherThread(items);
}

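That way we get memory synchronization and a valid happens-before ordering before the collection is passed on, which makes sure all objects are passed correctly. Strictly speaking, the Java Memory Model only guarantees this if the gather thread also synchronizes on the same monitor before it reads the collection.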
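For comparison, here is a minimal sketch of a monitor-based hand-off in which both sides lock the same object. This is the case the Java Memory Model covers directly: an unlock of a monitor happens-before every subsequent lock of that monitor. It assumes you control the gathering side as well, which the question rules out for passToGatherThread itself. MonitorHandOffSketch, publish and awaitResult are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

class MonitorHandOffSketch {
    private final Object lock = new Object();
    private Set<String> published; // guarded by lock

    // Worker side: the set is already filled; publish it while holding the lock.
    void publish(Set<String> items) {
        synchronized (lock) {
            published = items;
            lock.notifyAll();
        }
    }

    // Gather side: acquiring the same lock makes the worker's earlier writes visible.
    Set<String> awaitResult() throws InterruptedException {
        synchronized (lock) {
            while (published == null) {
                lock.wait();
            }
            return published;
        }
    }

    // Runs one worker thread and gathers its set on the calling thread.
    static int runOnce(int count) {
        MonitorHandOffSketch handOff = new MonitorHandOffSketch();
        Thread worker = new Thread(() -> {
            Set<String> items = new HashSet<>();
            for (int i = 0; i < count; i++) {
                items.add("item-" + i);
            }
            handOff.publish(items);
        });
        worker.start();
        try {
            int size = handOff.awaitResult().size();
            worker.join();
            return size;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("gathered " + runOnce(500) + " items");
    }
}
```

Unlike a synchronized collection, the lock is taken only once per hand-off, so later reads of the set by the gather thread are not slowed down.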

The worker implements Callable and returns a WorkerResult:

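import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

class Worker implements Callable<WorkerResult> {
    private final WorkerInput in;

    public Worker(WorkerInput in) {
        this.in = in;
    }

    @Override
    public WorkerResult call() {
        Set<ResultItems> items = new HashSet<>();
        // do work here, adding each produced item to the set
        return new WorkerResult(items);
    }
}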

Then we use an ExecutorService to manage the thread pool and collect the results through Futures.

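import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledWorkerController {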

    private static final int MAX_THREAD_POOL = 3;
    private final ExecutorService pool = 
       Executors.newFixedThreadPool(MAX_THREAD_POOL);

    public Set<ResultItems> process(List<WorkerInput> inputs) 
           throws InterruptedException, ExecutionException{         
        List<Future<WorkerResult>> submitted = new ArrayList<>();
        for (WorkerInput in : inputs) {
            Future<WorkerResult> future = pool.submit(new Worker(in));
            submitted.add(future);
        }
        Set<ResultItems> results = new HashSet<>();
        for (Future<WorkerResult> future : submitted) {
            results.addAll(future.get().getItems());
        }
        return results;
    }
}
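A self-contained variant of the same idea, as a sketch only: Integer replaces ResultItems so the code runs on its own, and the pool is shut down once the results are in, which the controller above leaves open. The class and method names are hypothetical. The approach is safe because, per the java.util.concurrent documentation, actions taken by the asynchronous computation happen-before actions following Future.get() in the retrieving thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureGatherSketch {

    // Each task builds its own non-thread-safe set and returns it.
    static Callable<Set<Integer>> task(int start, int count) {
        return () -> {
            Set<Integer> items = new HashSet<>();
            for (int i = 0; i < count; i++) {
                items.add(start + i);
            }
            return items;
        };
    }

    // Submits the tasks, then merges their sets on the calling thread.
    static Set<Integer> gather(int tasks, int perTask) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Future<Set<Integer>>> submitted = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                submitted.add(pool.submit(task(t * perTask, perTask)));
            }
            Set<Integer> results = new HashSet<>();
            for (Future<Set<Integer>> future : submitted) {
                // get() blocks until the task is done and publishes its writes
                results.addAll(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("gathered " + gather(4, 250).size() + " items");
    }
}
```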
