Minimizing the locking scope for check-and-set operations on a JDK 8 ConcurrentHashMap

Question

1.

I have multiple threads updating a ConcurrentHashMap. Each thread appends a list of integers to the value of a map entry, based on the key. No thread performs any removal.

The main point is that I want to minimize the scope of locking and synchronization as much as possible.

The documentation of the computeIf...() methods says "Some attempted update operations on this map by other threads may be blocked while computation is in progress", which is not very encouraging. On the other hand, when I looked into the source code, I could not find where it locks or synchronizes on the entire map.

Therefore, I wonder how the theoretical performance of computeIf...() compares with the home-grown 'method 2' below.

2.

Also, I feel that the problem described here is perhaps one of the simplest check-and-set (or, more generally, 'compound') operations you can carry out on a ConcurrentHashMap.

Yet I am not confident about, and cannot find much guidance on, how to do even this kind of simple compound operation on a ConcurrentHashMap without locking or synchronizing on the entire map.

So any general good-practice advice on this will be much appreciated.

// Shared by both test methods below
private final ConcurrentHashMap<String, List<Integer>> myMap = new ConcurrentHashMap<>();

// MAP KEY: a word found by a thread on a page of a book
private final String myKey = "word1";

public void myConcurrentHashMapTest1() {
    // -- Method 1:
    // Step 1.1 first, try to use computeIfPresent(). doc says it may lock the
    //      entire myMap.
    //      The remapping function must return the new value (the list),
    //      not the boolean result of addAll().
    myMap.computeIfPresent(myKey, (key, val) -> { val.addAll(getMyVals()); return val; });
    // Step 1.2 then use computeIfAbsent(). Again, doc says it may lock the
    //      entire myMap.
    myMap.computeIfAbsent(myKey, key -> getMyVals());
}

public void myConcurrentHashMapTest2() {        
    // -- Method 2: home-grown lock splitting (kind of). Will it theoretically 
    //      perform better? 

    // Step 2.1: TRY to directly put an empty list for the key
    //      This may have no effect if the key is already present in the map
    List<Integer> myEmptyList = new ArrayList<Integer>();
    myMap.putIfAbsent(myKey, myEmptyList);

    // Step 2.2: By now, we should have the key present in the map
    //      ASSUMPTION: no thread does removal 
    List<Integer> listInMap = myMap.get(myKey);

    // Step 2.3: Synchronize on that list, append all the values 
    synchronized(listInMap){
        listInMap.addAll(getMyVals());
    }

}

public List<Integer> getMyVals(){
    // MAP VALUE: e.g. Page Indices where word is found (by a thread)
    List<Integer> myValList = new ArrayList<Integer>(); 
    myValList.add(1);
    myValList.add(2);

    return myValList;
}

Answer 1

You're basing your assumption (that using ConcurrentHashMap as intended will be too slow for you) on a misinterpretation of the Javadoc. The Javadoc doesn't state that the whole map will be locked. It also doesn't state that each computeIfAbsent() operation performs pessimistic locking.

What can actually be locked is a bin (a.k.a. bucket), which corresponds to a single element of the internal array backing the ConcurrentHashMap. Note that this is not Java 7's map segment, which contains multiple buckets. When such a bin is locked, the only operations that can be blocked are updates for keys that hash to the same bin.

On the other hand, your solution doesn't avoid all internal locking within ConcurrentHashMap: computeIfAbsent() is just one of the methods that can degrade to using a synchronized block while updating. Even the putIfAbsent() with which you initially put an empty list for some key can block if it doesn't hit an empty bin.

What's worse, though, is that your solution doesn't guarantee the visibility of your synchronized bulk updates. You are guaranteed that a putIfAbsent() happens-before a get() that observes its value, but there's no happens-before between your bulk updates and a subsequent get(), unless the reader also synchronizes on the same list.
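To make the visibility point concrete, here is a minimal sketch of the lock-splitting idea in which readers take the same per-list monitor as writers. The class and method names (`SynchronizedListValues`, `append`, `snapshot`) are hypothetical and not from the question; this is one possible arrangement, not the only correct one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper illustrating per-list locking with visible reads.
class SynchronizedListValues {
    private final ConcurrentHashMap<String, List<Integer>> map = new ConcurrentHashMap<>();

    /** Appends values while holding the monitor of the list stored for the key. */
    void append(String key, List<Integer> values) {
        // Single lookup: returns the existing list or installs a new empty one.
        List<Integer> list = map.computeIfAbsent(key, k -> new ArrayList<>());
        synchronized (list) {
            list.addAll(values);
        }
    }

    /** Returns a copy taken under the same monitor, so earlier appends are visible. */
    List<Integer> snapshot(String key) {
        List<Integer> list = map.get(key);
        if (list == null) {
            return new ArrayList<>();
        }
        synchronized (list) {
            return new ArrayList<>(list);
        }
    }
}
```

Because both `append` and `snapshot` synchronize on the same list object, the unlock in the writer happens-before the lock in a later reader. This relies on the question's assumption that no thread removes or replaces entries.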

P.S. You can read more about the locking in ConcurrentHashMap in its OpenJDK implementation: http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/concurrent/ConcurrentHashMap.java , lines 313-352.

Answer 2

As already explained by Dimitar Dimitrov, a compute… method doesn't generally lock the entire map. In the best case, i.e. when there's no need to increase the capacity and there's no hash collision, only the mapping for the single key is locked.

However, there are still things you can do better:

Generally, avoid performing multiple lookups. This applies to both variants: using computeIfPresent followed by computeIfAbsent, as well as using putIfAbsent followed by get.
It's still recommended to minimize the code executed while holding a lock, i.e. don't invoke getMyVals() while holding the lock, as it doesn't depend on the map's state.
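A further consequence of splitting the update into two lookups, not spelled out above, is that Method 1 from the question can drop values: if two threads both find the key absent in step 1.1, only one of them gets its list installed in step 1.2. The sketch below replays that interleaving deterministically on a single thread and compares it with a single `merge` call per update. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; replays on one thread what two threads A and B could do.
class TwoStepLostUpdate {

    /** Two-step variant (computeIfPresent, then computeIfAbsent) under a bad interleaving. */
    static List<Integer> replayInterleaving() {
        ConcurrentHashMap<String, List<Integer>> map = new ConcurrentHashMap<>();
        List<Integer> fromA = new ArrayList<>(Arrays.asList(1, 2));
        List<Integer> fromB = new ArrayList<>(Arrays.asList(3, 4));

        // A and B both run step 1.1 while the key is still absent: both are no-ops.
        map.computeIfPresent("word1", (k, v) -> { v.addAll(fromA); return v; });
        map.computeIfPresent("word1", (k, v) -> { v.addAll(fromB); return v; });

        // A runs step 1.2 first and installs its list.
        map.computeIfAbsent("word1", k -> fromA);
        // B runs step 1.2: the key is now present, so B's values are never added.
        map.computeIfAbsent("word1", k -> fromB);

        return map.get("word1");
    }

    /** Single-step variant: each update is one atomic merge call. */
    static List<Integer> singleStep() {
        ConcurrentHashMap<String, List<Integer>> map = new ConcurrentHashMap<>();
        List<Integer> fromA = new ArrayList<>(Arrays.asList(1, 2));
        List<Integer> fromB = new ArrayList<>(Arrays.asList(3, 4));

        map.merge("word1", fromA, (a, b) -> { a.addAll(b); return a; });
        map.merge("word1", fromB, (a, b) -> { a.addAll(b); return a; });

        return map.get("word1");
    }
}
```

In the replayed interleaving the map ends up holding only A's values, while the single-step variant keeps all four.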

Putting it together, the update should look like:

// compute without holding a lock
List<Integer> toAdd=getMyVals();
// update the map
myMap.compute(myKey, (key,val) -> {
    if(val==null) val=toAdd; else val.addAll(toAdd);
    return val;
});

or

// compute without holding a lock
List<Integer> toAdd=getMyVals();
// update the map
myMap.merge(myKey, toAdd, (a,b) -> { a.addAll(b); return a; });

which can be simplified to

myMap.merge(myKey, getMyVals(), (a,b) -> { a.addAll(b); return a; });
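As a rough end-to-end check of the merge-based update, the following sketch runs several threads that all append to the same key and then counts the stored elements. The class name `MergeAppendDemo`, the `run` method, and its parameters are assumptions made for illustration; the values `1, 2` mirror getMyVals() from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of the question or answers.
class MergeAppendDemo {

    /** Starts the given number of threads, each merging two values per iteration. */
    static int run(int threads, int appendsPerThread) {
        ConcurrentHashMap<String, List<Integer>> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    // Built outside the map's lock; must be a fresh mutable list,
                    // because merge may install it as the map value.
                    List<Integer> toAdd = new ArrayList<>(Arrays.asList(1, 2));
                    map.merge("word1", toAdd, (a, b) -> { a.addAll(b); return a; });
                }
            });
            workers[t].start();
        }
        try {
            // join() makes every worker's writes visible to this thread.
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return map.get("word1").size();
    }
}
```

The remapping function runs atomically per key, so no append is lost. Note that this only covers the writers: reading a list while other threads are still merging into it would need additional coordination, since ArrayList itself is not thread-safe.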

Answer 1: 3 votes, accepted, 2016-04-14 23:14:02

Answer 2: 0 votes, 2016-04-15 09:22:17