
Populate a thread-safe Set concurrently

Assume I have a thread-safe collection, which I would populate in the following manner:

Set set = new HashSet();
for (Map map : maps) {
    set.addAll(doSomeExpensiveProcessing(map.keySet()));
}

What would be the best way of performing this concurrently? (i.e. each map would concurrently add its keys to the set.)

EDIT - I'm aware HashSet is not thread-safe, but that is outside the scope of the question, as far as I'm concerned.

EDIT2 - It was correctly pointed out that for this particular scenario concurrency will not reap huge benefits, but there will be additional steps, which I've now included in the code example.

This should work:

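// NB - Be sure to use a concurrent form of Set here; a plain HashSet is not safe
// when several threads call addAll at once.
Set<Object> set = Collections.newSetFromMap(new ConcurrentHashMap<Object, Boolean>());
ArrayList<Map<?, ?>> maps = new ArrayList<>();

public void test() {
  // One thread per map; each thread adds that map's keys to the shared set.
  for (final Map<?, ?> map : maps) {
    new Thread(new Runnable() {
      @Override
      public void run() {
        set.addAll(map.keySet());
      }
    }).start();
  }
}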
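One thing the snippet above leaves open is how the caller knows when the set is complete, since start() returns immediately. A minimal sketch of one way to handle that, assuming String keys and Integer values (both are assumptions for illustration, as is the class name ThreadPerMapDemo), is to keep the threads in a list and join them:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ThreadPerMapDemo {

    // Starts one thread per map, then waits for all of them to finish.
    static Set<String> collectKeys(List<Map<String, Integer>> maps) {
        final Set<String> set = ConcurrentHashMap.newKeySet(); // thread-safe set, Java 8+
        List<Thread> threads = new ArrayList<>();
        for (final Map<String, Integer> map : maps) {
            Thread t = new Thread(() -> set.addAll(map.keySet()));
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // returns once that thread has finished adding its keys
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return set;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("x", 1);
        a.put("y", 2);
        Map<String, Integer> b = new HashMap<>();
        b.put("y", 3);
        b.put("z", 4);
        List<Map<String, Integer>> maps = new ArrayList<>();
        maps.add(a);
        maps.add(b);
        System.out.println(collectKeys(maps).size()); // number of distinct keys
    }
}
```

Creating a thread per map is fine for a handful of maps; for many maps a thread pool is usually the better fit.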

I realise you are not interested in the implementation of a concurrent HashSet, but for completeness I would like to mention the options.

You could consider a ConcurrentSkipListSet if your objects implement Comparable; alternatively, Collections.newSetFromMap(new ConcurrentHashMap<Object,Boolean>()) would do.
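As a rough illustration of those options, the sketch below builds each kind of set. The class and method names are hypothetical, and ConcurrentHashMap.newKeySet() (available since Java 8) is added here as a shorthand that the answer itself does not mention:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

class ConcurrentSetOptions {

    // Sorted, thread-safe set; elements must be Comparable (or pass a Comparator).
    static <T extends Comparable<T>> Set<T> sortedConcurrentSet() {
        return new ConcurrentSkipListSet<T>();
    }

    // Hash-based, thread-safe set backed by a ConcurrentHashMap.
    static <T> Set<T> hashedConcurrentSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<T, Boolean>());
    }

    // Java 8+ shorthand for a ConcurrentHashMap-backed set.
    static <T> Set<T> keySetView() {
        return ConcurrentHashMap.newKeySet();
    }

    public static void main(String[] args) {
        Set<String> sorted = sortedConcurrentSet();
        sorted.add("b");
        sorted.add("a");
        sorted.add("a"); // duplicate, ignored
        System.out.println(sorted); // iterates in natural order

        Set<String> hashed = hashedConcurrentSet();
        hashed.add("x");
        System.out.println(hashed.contains("x"));
    }
}
```

The skip-list variant keeps elements ordered at the cost of O(log n) operations; the hash-based variants make no ordering guarantee.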

While @OldCurmudgeon has a nice basic approach, in more serious code you probably want to write a Callable that does the expensive processing of the keys and returns a new Collection. That can be combined with an Executor and/or a CompletionService. You don't even need a concurrent collection at the end.

E.g., if the keys are Strings:

public class DoesExpensiveProcessing implements Callable<Set<String>> {

   final Set<String> inKeys;

   public DoesExpensiveProcessing(Set<String> keys) {
     this.inKeys = keys;  // make a defensive copy if required...
   }

   @Override
   public Set<String> call() {
      Set<String> result = new HashSet<String>();
      // do the expensive processing on inKeys here, adding each output to result
      return result;
   }
}

At this point you don't even need a concurrent collection:

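List<DoesExpensiveProcessing> doInParallel = new ArrayList<DoesExpensiveProcessing>();
for (Map<String, ?> map : maps) {
   doInParallel.add(new DoesExpensiveProcessing(map.keySet()));
}

// Only the calling thread touches this set, so a plain HashSet is enough.
Set<String> theResultingSet = new HashSet<String>();
// invokeAll blocks until every task has completed; it can throw InterruptedException
List<Future<Set<String>>> futures = someExecutorService.invokeAll(doInParallel);
for (Future<Set<String>> f : futures) {
  theResultingSet.addAll(f.get());  // get() throws ExecutionException if the task failed
}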
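Putting the fragments above together, a self-contained sketch might look like the following. The class name, the pool size of 4, the Integer value type, and the upper-casing stand-in for the "expensive processing" are all assumptions made for illustration; the original leaves those details open.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllDemo {

    // Hypothetical stand-in for the expensive processing: upper-cases each key.
    static Set<String> doSomeExpensiveProcessing(Set<String> keys) {
        Set<String> out = new HashSet<>();
        for (String k : keys) {
            out.add(k.toUpperCase(Locale.ROOT));
        }
        return out;
    }

    static Set<String> processAll(List<Map<String, Integer>> maps) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary
        try {
            List<Callable<Set<String>>> tasks = new ArrayList<>();
            for (Map<String, Integer> map : maps) {
                final Set<String> keys = new HashSet<>(map.keySet()); // defensive copy
                tasks.add(() -> doSomeExpensiveProcessing(keys));
            }
            // Results are merged on the calling thread only, so HashSet suffices.
            Set<String> result = new HashSet<>();
            for (Future<Set<String>> f : pool.invokeAll(tasks)) {
                result.addAll(f.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("a", 1);
        a.put("b", 2);
        Map<String, Integer> b = new HashMap<>();
        b.put("b", 3);
        b.put("c", 4);
        List<Map<String, Integer>> maps = new ArrayList<>();
        maps.add(a);
        maps.add(b);
        System.out.println(processAll(maps).size()); // number of distinct processed keys
    }
}
```

The expensive work runs in parallel on the pool, while the cheap merge step stays single-threaded, which is why no concurrent Set is required.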

This way the additions would not run concurrently, but they would at least be thread-safe:

Set set = Collections.synchronizedSet(new HashSet());
...
// in some other threads:
for (Map map : maps) {
  set.addAll(map.keySet());
}

Or do you prefer something like the following:

ConcurrentMap<Object, Boolean> set = new ConcurrentHashMap<Object, Boolean>();
...
// in some other threads:
for (Map map : maps) {
  for (Object o : map.keySet()) {
    set.putIfAbsent(o, true);
  }
}
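For what it's worth, a ConcurrentMap used this way can still be read back as a Set through keySet(), and the return value of putIfAbsent tells you whether the key was new. A small sketch, with String keys and the class name MapAsSetDemo chosen purely for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MapAsSetDemo {

    // Adds every key to the map and returns how many were not already present.
    static int addAll(ConcurrentMap<String, Boolean> seen, List<String> keys) {
        int added = 0;
        for (String k : keys) {
            // putIfAbsent returns null only when this call inserted the key
            if (seen.putIfAbsent(k, Boolean.TRUE) == null) {
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Boolean> seen = new ConcurrentHashMap<>();
        int added = addAll(seen, Arrays.asList("a", "b", "a"));
        Set<String> asSet = seen.keySet(); // live Set view of the map's keys
        System.out.println(added + " new keys, " + asSet.size() + " in the set");
    }
}
```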

