How does computeIfAbsent fail ConcurrentHashMap randomly?

I have the following code. It is a toy example, but it reproduces the problem:

import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

import static java.util.Arrays.stream;
import static java.util.stream.Collectors.toList;

public class TestClass3 {
    public static void main(String[] args) throws InterruptedException {
        // Setup data that we will be playing with concurrently
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i", "j");

        HashMap<String, List<Integer>> keyValueMap = new HashMap<>();
        for (String key : keys) {
            int[] randomInts = new Random().ints(10000, 0, 10000).toArray();
            keyValueMap.put(key, stream(randomInts).boxed().collect(toList()));
        }

        // Entering danger zone, concurrently transforming our data to another shape
        ExecutorService es = Executors.newFixedThreadPool(10);
        Map<Integer, Set<String>> valueKeyMap = new ConcurrentHashMap<>();
        for (String key : keys) {
            es.submit(() -> {
                for (Integer value : keyValueMap.get(key)) {
                    valueKeyMap.computeIfAbsent(value, val -> new HashSet<>()).add(key);
                }
            });
        }
        // Wait for all tasks in executorservice to finish
        es.shutdown();
        es.awaitTermination(1, TimeUnit.MINUTES);
        // Danger zone ends..

        // We should be in a single-thread environment now and safe
        StringBuilder stringBuilder = new StringBuilder();
        for (Integer integer : valueKeyMap.keySet()) {
            String collect = valueKeyMap
                    .get(integer)
                    .stream()
                    .sorted()  // This will blow randomly
                    .collect(Collectors.joining());
            stringBuilder.append(collect);  // just to print something..
        }
        System.out.println(stringBuilder.length());
    }
}

When I run this code over and over again, it usually runs without any exception and prints some number. However, from time to time (approximately 1 in 10 runs) I get an exception like this:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 6
    at java.util.stream.SortedOps$SizedRefSortingSink.accept(SortedOps.java:369)
    at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1556)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
    at biz.tugay.TestClass3.main(TestClass3.java:40)

I am pretty certain it has something to do with this line:

valueKeyMap.computeIfAbsent(value, val -> new HashSet<>()).add(key);

If I change this part as follows, I never get an exception:

synchronized (valueKeyMap) {
    valueKeyMap.computeIfAbsent(value, val -> new HashSet<>()).add(key);
}

My guess is that computeIfAbsent is still modifying valueKeyMap even after all threads have finished.

Could someone explain why this code fails randomly? Or is there a totally different cause that I am unable to see, and am I wrong to assume that computeIfAbsent is to blame?

The problem isn't in the computeIfAbsent call, but rather in the .add(key) at the end: multiple threads can try to add elements to the same HashSet, with nothing to ensure safe concurrent access. Since HashSet isn't thread-safe, this doesn't work properly, and the HashSet sometimes ends up in a corrupt state. Later, when you iterate over the HashSet to build a string, it blows up because of this corrupt state. (Judging from the stack trace, the set's recorded size is smaller than the number of elements it actually holds: sorted() allocates its buffer from the reported size and then receives more elements than fit.)

Even in the runs where you don't get an exception, you probably sometimes end up "dropping" elements that should have been added, because concurrent updates to the same set can overwrite one another.
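One way to check for such silently dropped elements is to compare the concurrent result against a reference built on a single thread. The following is only a sketch: the class InvertedIndexCheck and the methods expected and countMissing are hypothetical names introduced for illustration, and the "damaged" map in main simulates a lost update by hand rather than reproducing the race.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class InvertedIndexCheck {
    // Single-threaded reference: value -> set of keys whose list contains that value.
    static Map<Integer, Set<String>> expected(Map<String, List<Integer>> keyValueMap) {
        Map<Integer, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : keyValueMap.entrySet()) {
            for (Integer value : e.getValue()) {
                result.computeIfAbsent(value, v -> new HashSet<>()).add(e.getKey());
            }
        }
        return result;
    }

    // Counts (value, key) pairs that are in the reference but missing from the actual map.
    static int countMissing(Map<Integer, Set<String>> expected,
                            Map<Integer, Set<String>> actual) {
        int missing = 0;
        for (Map.Entry<Integer, Set<String>> e : expected.entrySet()) {
            Set<String> got = actual.get(e.getKey());
            for (String key : e.getValue()) {
                if (got == null || !got.contains(key)) {
                    missing++;
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> data = new HashMap<>();
        data.put("a", Arrays.asList(1, 2));
        data.put("b", Arrays.asList(2, 3));
        Map<Integer, Set<String>> reference = expected(data);

        // Hand-built stand-in for a racy result: "b" never made it into the set for 2.
        Map<Integer, Set<String>> damaged = new HashMap<>();
        damaged.put(1, new HashSet<>(Arrays.asList("a")));
        damaged.put(2, new HashSet<>(Arrays.asList("a")));
        damaged.put(3, new HashSet<>(Arrays.asList("b")));

        System.out.println(countMissing(reference, reference)); // prints 0
        System.out.println(countMissing(reference, damaged));   // prints 1
    }
}
```

In the question's program, the reference would be built from keyValueMap and compared with valueKeyMap after awaitTermination returns. A nonzero count on a run that threw no exception would indicate lost updates.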

ConcurrentHashMap.computeIfAbsent executes atomically: the mapping function is applied at most once per key, and no other thread can update the mapping for that key while it runs.

However, there is no such guarantee once the value is returned. The returned HashSet can then be modified by several threads at once, and HashSet is not safe for concurrent writes.

Instead, you can do something like this:

valueKeyMap.compute(value, (k, v) -> {
    // The whole lambda runs atomically for this key, including the add below
    if (v == null) {
        v = new HashSet<>();
    }
    v.add(key);
    return v;
});

This works because compute is atomic too, and here the add happens inside the atomic operation rather than after it.
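To see the compute approach in a complete program, here is a minimal sketch modeled on the question's code. The class InvertWithCompute, the methods invert and sampleData, the pool size of 4, and the sample data (ten keys that all map to the values 0 to 999) are assumptions chosen for illustration, so that the expected result is known in advance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class InvertWithCompute {
    // Builds value -> keys from key -> values, running one task per key.
    static Map<Integer, Set<String>> invert(Map<String, List<Integer>> keyValueMap) {
        Map<Integer, Set<String>> valueKeyMap = new ConcurrentHashMap<>();
        ExecutorService es = Executors.newFixedThreadPool(4);
        for (Map.Entry<String, List<Integer>> entry : keyValueMap.entrySet()) {
            es.submit(() -> {
                for (Integer value : entry.getValue()) {
                    // Creating the set and adding to it happen in one atomic step per key.
                    valueKeyMap.compute(value, (k, v) -> {
                        if (v == null) {
                            v = new HashSet<>();
                        }
                        v.add(entry.getKey());
                        return v;
                    });
                }
            });
        }
        es.shutdown();
        try {
            if (!es.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return valueKeyMap;
    }

    // Illustrative data: each of the ten keys maps to the values 0..999.
    static Map<String, List<Integer>> sampleData() {
        Map<String, List<Integer>> data = new HashMap<>();
        for (String key : new String[] {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"}) {
            List<Integer> values = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                values.add(i);
            }
            data.put(key, values);
        }
        return data;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> result = invert(sampleData());
        long complete = result.values().stream().filter(s -> s.size() == 10).count();
        // prints "1000 values, 1000 with all 10 keys"
        System.out.println(result.size() + " values, " + complete + " with all 10 keys");
    }
}
```

Note that the sets are only read after awaitTermination has returned. Reading a HashSet obtained through get while other threads are still inside compute would still be unsafe, because only the updates are serialized.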

The fact that you do not get an exception when using synchronized should already shed some light on where the problem is. As already stated, the problem is indeed the HashSet, which is not thread-safe. This is also stated in the documentation of the class:

Note that this implementation is not synchronized. If multiple threads access a hash set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set.

The solution is to either use a synchronized block or use a thread-safe set, such as the KeySetView returned by ConcurrentHashMap.newKeySet().
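As a rough sketch of the second option, the original computeIfAbsent line can stay as it is if the mapping function returns a concurrent set instead of a HashSet. The class InvertWithKeySet, the methods invert and sampleData, the pool size, and the sample data are hypothetical and exist only to make the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class InvertWithKeySet {
    // Builds value -> keys from key -> values, running one task per key.
    static Map<Integer, Set<String>> invert(Map<String, List<Integer>> keyValueMap) {
        Map<Integer, Set<String>> valueKeyMap = new ConcurrentHashMap<>();
        ExecutorService es = Executors.newFixedThreadPool(4);
        for (Map.Entry<String, List<Integer>> entry : keyValueMap.entrySet()) {
            es.submit(() -> {
                for (Integer value : entry.getValue()) {
                    // The set itself is thread-safe, so add() may run outside the atomic call.
                    valueKeyMap
                            .computeIfAbsent(value, v -> ConcurrentHashMap.newKeySet())
                            .add(entry.getKey());
                }
            });
        }
        es.shutdown();
        try {
            if (!es.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return valueKeyMap;
    }

    // Illustrative data: each of the five keys maps to the values 0..499.
    static Map<String, List<Integer>> sampleData() {
        Map<String, List<Integer>> data = new HashMap<>();
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            List<Integer> values = new ArrayList<>();
            for (int i = 0; i < 500; i++) {
                values.add(i);
            }
            data.put(key, values);
        }
        return data;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> result = invert(sampleData());
        long complete = result.values().stream().filter(s -> s.size() == 5).count();
        // prints "500 values, 500 with all 5 keys"
        System.out.println(result.size() + " values, " + complete + " with all 5 keys");
    }
}
```

Compared with wrapping every update in synchronized (valueKeyMap), this keeps updates to different values from blocking one another, at the cost of a somewhat heavier set per value.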
