
Does double-checked locking work with a final Map in Java?

I'm trying to implement a thread-safe Map cache, and I want the cached Strings to be lazily initialized. Here's my first pass at an implementation:

public class ExampleClass {

    private static final Map<String, String> CACHED_STRINGS = new HashMap<String, String>();

    public String getText(String key) {

        String string = CACHED_STRINGS.get(key);

        if (string == null) {

            synchronized (CACHED_STRINGS) {

                string = CACHED_STRINGS.get(key);

                if (string == null) {
                    string = createString();
                    CACHED_STRINGS.put(key, string);
                }
            }
        }

        return string;
    }
}

After writing this code, NetBeans warned me about "double-checked locking," so I started researching it. I found The "Double-Checked Locking is Broken" Declaration and read it, but I'm unsure whether my implementation falls prey to the issues it describes. It seems like all the issues mentioned in the article relate to object instantiation with the new operator inside the synchronized block. I'm not using the new operator, and Strings are immutable, so I'm not sure whether the article applies to this situation. Is this a thread-safe way to cache strings in a HashMap? Does the thread-safety depend on what happens in the createString() method?

No, it's not correct, because the first access is done outside of a synchronized block.

It comes down to how get and put might be implemented. You must bear in mind that they are not atomic operations.

For example, what if they were implemented like this:

public String get(String key) {
    Entry e = findEntry(key);
    return e.value;
}

public void put(String key, String value) {
    Entry e = addNewEntry(key);
    // danger: a concurrent get between these two lines sees the empty value
    e.value = value;
}

private Entry addNewEntry(String key) {
    Entry entry = new Entry(key, ""); // a new entry starts with an empty string, not null!
    addToBuckets(entry); // now it's findable by get
    return entry;
}

Now get might not return null while a put operation is still in progress, and the whole getText method could return the wrong value.

The example is a bit convoluted, but you can see that the correct behaviour of your code relies on the inner workings of the map class. That's not good.

And while you can look that code up, you cannot account for compiler, JIT and processor optimisations and inlining, which can effectively change the order of operations, just like the wacky but correct way I chose to write that map implementation.
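The simplest way to stay independent of those map internals is to take the lock around every access, the first read included. Here is a minimal sketch of that variant; note that createString is given a key parameter and a placeholder body here, which the asker's original code does not specify:

```java
import java.util.HashMap;
import java.util.Map;

public class SafeExampleClass {

    private static final Map<String, String> CACHED_STRINGS = new HashMap<>();

    public String getText(String key) {
        // Every read and write of the map happens under the same lock,
        // so no thread can observe the HashMap in a mid-update state.
        synchronized (CACHED_STRINGS) {
            String string = CACHED_STRINGS.get(key);
            if (string == null) {
                string = createString(key);
                CACHED_STRINGS.put(key, string);
            }
            return string;
        }
    }

    // Stand-in for the asker's createString(); assumed key-based for this sketch.
    private String createString(String key) {
        return "value-for-" + key;
    }

    public static void main(String[] args) {
        SafeExampleClass example = new SafeExampleClass();
        System.out.println(example.getText("greeting")); // prints "value-for-greeting"
    }
}
```

This gives up the lock-free fast path of the double-checked version, but it is correct regardless of how the map is implemented.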

Consider using a ConcurrentHashMap and the method Map.computeIfAbsent(), which takes a function to call to compute a default value if the key is absent from the map.

Map<String, String> cache = new ConcurrentHashMap<>();
cache.computeIfAbsent("key", key -> "ComputedDefaultValue");

Javadoc: If the specified key is not already associated with a value, attempts to compute its value using the given mapping function and enters it into this map unless null. The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
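Putting that together, the asker's ExampleClass could be rewritten along these lines. This is a sketch: createString is given a key parameter and a placeholder body, which the original question leaves unspecified.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentExampleClass {

    private static final Map<String, String> CACHED_STRINGS = new ConcurrentHashMap<>();

    public String getText(String key) {
        // computeIfAbsent is atomic: the mapping function runs at most once
        // per key, even if many threads ask for the same missing key at once.
        return CACHED_STRINGS.computeIfAbsent(key, k -> createString(k));
    }

    // Stand-in for the asker's createString(); assumed key-based for this sketch.
    private String createString(String key) {
        return "value-for-" + key;
    }

    public static void main(String[] args) {
        ConcurrentExampleClass example = new ConcurrentExampleClass();
        System.out.println(example.getText("greeting")); // prints "value-for-greeting"
    }
}
```

There is no explicit locking and no double check: the map itself guarantees that the value for each key is computed once.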

Non-trivial problem domains:

Concurrency is easy to do and hard to do correctly.

Caching is easy to do and hard to do correctly.

Both are right up there with encryption in the category of things that are hard to get right without an intimate understanding of the problem domain and its many subtle side effects and behaviors.

Combine them and you get a problem an order of magnitude harder than either one alone.

This is a non-trivial problem that your naive implementation will not solve in a bug-free manner. The HashMap you are using is not going to be thread-safe unless every access is checked and serialized; even then it will not be performant, and the contention will cause lots of blocking and latency, depending on the usage.

The proper way to implement a lazy-loading cache is to use something like a Guava Cache with a CacheLoader, which takes care of all the concurrency and cache race conditions for you transparently. A cursory glance through the source code shows how they do it.

No, and ConcurrentHashMap would not help.

Recap: the double-check idiom is typically about assigning a new instance to a variable/field; it is broken because the compiler can reorder instructions, meaning the field can be assigned a partially constructed object.

For your setup, you have a distinct issue: the map.get() is not safe from a put() that may be occurring concurrently and possibly rehashing the table. Using a ConcurrentHashMap fixes only that, not the risk of a false positive (where you think the map has no entry, but it is actually being created). The issue is not so much a partially constructed object as the duplication of work.

As for the avoidable Guava CacheLoader: this is just a lazy-init callback that you give to the map so it can create the object if it's missing. This is essentially the same as putting all the 'if null' code inside the lock, which is certainly not going to be faster than good old direct synchronization. (The only time it makes sense to use a CacheLoader is to plug in a factory for such missing objects when you are passing the map to classes that don't know how to create the missing objects and don't want to be told how.)
