Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread?

Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread? My expectation is that it is, and reading the JavaDocs seems to indicate so, but I am 99% convinced that reality is different. On my production server the below seems to be happening. (I've caught it with logging.)

Pseudo code example:

static final ConcurrentHashMap map = new ConcurrentHashMap();
//sharedLock is key specific.  One map, many keys.  There is a 1:1 
//      relationship between key and Foo instance.
void doSomething(Semaphore sharedLock) {
    boolean haveLock = sharedLock.tryAcquire(3000, MILLISECONDS);

    if (haveLock) {
        log("Have lock: " + threadId);
        Foo foo = map.get("key");
        log("foo=" + foo);

        if (foo == null) {
            log("New foo time! " + threadId);
            foo = new Foo(); //foo is expensive to instance
            map.put("key", foo);

        } else
            log("Found foo:" + threadId);

        log("foo=" + foo);
        sharedLock.release();

    } else
        log("No lock acquired");
} 

What seems to be happening is this:

Thread 1                          Thread 2
 - request lock                    - request lock
 - have lock                       - blocked waiting for lock
 - get from map, nothing there
 - create new foo
 - place new foo in map
 - logs foo.toString()
 - release lock
 - exit method                     - have lock
                                   - get from map, NOTHING THERE!!! (Why not?)
                                   - create new foo
                                   - place new foo in map
                                   - logs foo.toString()
                                   - release lock
                                   - exit method

So, my output looks like this:

Have lock: 1    
foo=null
New foo time! 1
foo=foo@cafebabe420
Have lock: 2    
foo=null
New foo time! 2
foo=foo@boof00boo    

The second thread does not immediately see the put! Why? On my production system, there are more threads and I've only seen one thread, the first one that immediately follows thread 1, have a problem.

I've even tried shrinking the concurrency level on ConcurrentHashMap to 1, not that it should matter. E.g.:

static ConcurrentHashMap map = new ConcurrentHashMap(32, 1);

Where am I going wrong? My expectation? Or is there some bug in my code (the real software, not the above) that is causing this? I've gone over it repeatedly and am 99% sure I'm handling the locking correctly. I cannot even fathom a bug in ConcurrentHashMap or the JVM. Please save me from myself.

Gory specifics that might be relevant:

  • quad-core 64-bit Xeon (DL380 G5)
  • RHEL4 ( Linux mysvr 2.6.9-78.0.5.ELsmp #1 SMP ... x86_64 GNU/Linux )
  • Java 6 ( build 1.6.0_07-b06 , 64-Bit Server VM (build 10.0-b23, mixed mode) )

Creating an expensive object for a cache after failing to find it in the cache is a known problem. Fortunately, a solution has already been implemented.

You can use MapMaker from Google Collections. You give it a callback that creates your object; if client code looks up a key that is not in the map, the callback is called and its result is put in the map.

See MapMaker javadocs ...

 ConcurrentMap<Key, Graph> graphs = new MapMaker()
       .concurrencyLevel(32)
       .softKeys()
       .weakValues()
       .expiration(30, TimeUnit.MINUTES)
       .makeComputingMap(
           new Function<Key, Graph>() {
             public Graph apply(Key key) {
               return createExpensiveGraph(key);
             }
           });

BTW, in your original example there is no advantage to using a ConcurrentHashMap, as you are locking each access, why not just use a normal HashMap inside your locked section?

Some good answers here, but as far as I can tell no-one has actually provided a canonical answer to the question asked: "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread". Those that have said yes haven't provided a source.

So: yes, it is guaranteed. Source (see the section 'Memory Consistency Properties'):

Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
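To make that guarantee concrete, here is a minimal sketch (the class and method names are made up for illustration, not taken from the question). The writer sets a plain, non-volatile field before the put(), and the reader spins until get() returns the object. Because of the happens-before edge quoted above, the reader must then see the field's value as well.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class VisibilityDemo {
    // Hypothetical value class; payload is deliberately neither final nor volatile.
    static final class Foo {
        int payload;
    }

    // Returns the payload observed by a second thread after it finds the entry.
    static int readPayloadFromOtherThread() {
        final ConcurrentMap<String, Foo> map = new ConcurrentHashMap<String, Foo>();
        final int[] seen = new int[1];

        Thread writer = new Thread(new Runnable() {
            public void run() {
                Foo foo = new Foo();
                foo.payload = 42;        // plain write, made before the put()
                map.put("key", foo);     // publishes foo and everything written before
            }
        });
        Thread reader = new Thread(new Runnable() {
            public void run() {
                Foo foo;
                while ((foo = map.get("key")) == null) {
                    Thread.yield();      // wait until the put() becomes visible
                }
                seen[0] = foo.payload;   // ordered after the writer's plain write
            }
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();               // join() makes seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readPayloadFromOtherThread()); // prints 42
    }
}
```

Note that this only illustrates the guarantee; a passing run cannot prove a memory-model property, which is why the documentation quote is the real answer.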

One thing to consider is whether your keys are equal and have identical hashcodes at both times of the "get" call. If they're just Strings then yes, there's not going to be a problem here. But as you haven't given the generic type of the keys, and you have elided "unimportant" details in the pseudocode, I wonder if you're using another class as a key.

In any case, you may want to additionally log the hashcode of the keys used for the gets/puts in threads 1 and 2. If these are different, you have your problem. Also note that key1.equals(key2) must be true; this isn't something you can log definitively, but if the keys aren't final classes it would be worth logging their fully qualified class name, then looking at the equals() method for that class/classes to see if it's possible that the second key could be considered unequal to the first.
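As an illustration of how a key problem can look exactly like a "lost" put, here is a small sketch. The Key class is hypothetical (the question does not show the real key type); it bases equals() and hashCode() on a mutable field, and changing that field after the put() makes the entry unreachable.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MutableKeyDemo {
    // Hypothetical key type whose identity depends on a mutable field.
    static final class Key {
        int id;
        Key(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int hashCode() { return id; }
    }

    // Returns { lookup before mutation, lookup with mutated key, lookup with fresh key }.
    static String[] demo() {
        ConcurrentMap<Key, String> map = new ConcurrentHashMap<Key, String>();
        Key key = new Key(1);
        map.put(key, "foo");

        String before = map.get(key);              // found: hash and equals both match

        key.id = 2;                                // key mutated while it is in the map
        String withMutatedKey = map.get(key);      // hash is now 2, entry was stored under 1
        String withFreshKey = map.get(new Key(1)); // hash matches, but equals() now fails

        return new String[] { before, withMutatedKey, withFreshKey };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints: foo null null
    }
}
```

Logging hashCode() and the class name at each get() and put(), as suggested above, would reveal this kind of mismatch.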

And to answer your title - yes, ConcurrentHashMap.get() is guaranteed to see any previous put(), where "previous" means there is a happens-before relationship between the two as specified by the Java Memory Model. (For the ConcurrentHashMap in particular, this is essentially what you'd expect, with the caveat that you may not be able to tell which happens first if both threads execute at "exactly the same time" on different cores. In your case, though, you should definitely see the result of the put() in thread 2).

If a thread puts a value in a concurrent hash map, any other thread that then retrieves that value from the map is guaranteed to see the value inserted by the first thread.

This issue has been clarified in "Java Concurrency in Practice" by Brian Goetz et al. (Joshua Bloch is one of the co-authors).

Quoting from the text :-

The thread-safe library collections offer the following safe publication guarantees, even if the javadoc is less than clear on the subject:

  • Placing a key or value in a Hashtable, synchronizedMap or ConcurrentMap safely publishes it to any other thread that retrieves it from the Map (whether directly or via an iterator);

I don't think the problem is in "ConcurrentHashMap" but rather somewhere in your code or about the reasoning about your code. I can't spot the error in the code above (maybe we just don't see the bad part?).

But to answer your question "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread?" I've hacked together a small test program.

In short: no, the problem is not in ConcurrentHashMap. It is OK!

If the map misbehaved, the following program should print "Bad access!" at least from time to time. It runs 100000 calls to the method you outlined above on a pool of 100 threads. But it prints "All ok!".

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class Test {
    private final static ConcurrentHashMap<String, Test> map = new ConcurrentHashMap<String, Test>();
    private final static Semaphore lock = new Semaphore(1);
    private static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(100);
        List<Callable<Boolean>> testCalls = new ArrayList<Callable<Boolean>>();
        for (int n = 0; n < 100000; n++)
            testCalls.add(new Callable<Boolean>() {
                @Override
                public Boolean call() throws Exception {
                    doSomething(lock);
                    return true;
                }
            });
        pool.invokeAll(testCalls);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("All ok!");
    }

    static void doSomething(Semaphore lock) throws InterruptedException {
        boolean haveLock = lock.tryAcquire(3000, TimeUnit.MILLISECONDS);

        if (haveLock) {
            Test foo = map.get("key");
            if (foo == null) {
                foo = new Test();
                map.put("key", foo); // store the instance just created, not a second one
                if (counter > 0)
                    System.err.println("Bad access!");
                counter++;
            }
            lock.release();
        } else {
            System.err.println("Fail to lock!");
        }
    }
}

Update: putIfAbsent() is logically correct here, but it doesn't solve the problem of creating a Foo only when the key is not present. The caller always has to create the Foo first, even if it doesn't end up in the map. David Roussel's answer is good, assuming you can accept the Google Collections dependency in your app.
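If the extra dependency is not an option, one way to combine putIfAbsent() with create-only-once behaviour is to store a Future in the map instead of the value itself. This is only a sketch in the spirit of the Memoizer example from "Java Concurrency in Practice"; FooCache, Foo and the created counter are illustrative names, not part of the original code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

class FooCache {
    // Stand-in for the expensive object from the question.
    static final class Foo {}

    // Counts how many Foo instances were really constructed (for demonstration only).
    static final AtomicInteger created = new AtomicInteger();

    private static final ConcurrentMap<String, Future<Foo>> cache =
            new ConcurrentHashMap<String, Future<Foo>>();

    static Foo get(final String key) {
        Future<Foo> f = cache.get(key);
        if (f == null) {
            // Creating the task is cheap; the expensive work happens only in run().
            FutureTask<Foo> task = new FutureTask<Foo>(new Callable<Foo>() {
                public Foo call() {
                    created.incrementAndGet();
                    return new Foo();
                }
            });
            f = cache.putIfAbsent(key, task);
            if (f == null) {   // this thread won the race, so it does the work
                f = task;
                task.run();
            }
        }
        try {
            return f.get();    // losers of the race block here until the Foo exists
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            cache.remove(key, f);   // don't leave a failed computation in the cache
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Calling FooCache.get("key") twice returns the same Foo, and the constructor runs once. No semaphore is needed, because putIfAbsent() decides atomically which thread does the construction.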


Maybe I'm missing something obvious, but why are you guarding the map with a Semaphore? ConcurrentHashMap (CHM) is thread-safe (assuming it's safely published, which it is here). If you're trying to get atomic "put if not already in there", use chm.putIfAbsent(). If you need more complicated invariants where the map contents cannot change, you probably need to use a regular HashMap and synchronize it as usual.

To answer your question more directly: Once your put returns, the value you put in the map is guaranteed to be seen by the next thread that looks for it.

Side note, just a +1 to some other comments about putting the semaphore release in a finally.

if (sem.tryAcquire(3000, TimeUnit.MILLISECONDS)) {
    try {
        // do stuff while holding permit    
    } finally {
        sem.release();
    }
}

Are we seeing an interesting manifestation of the Java Memory Model? Under what conditions are registers flushed to main memory? I think it's guaranteed that if two threads synchronize on the same object then they will see a consistent memory view.

I don't know what Semaphore does internally. It almost certainly must do some synchronization, but do we know that?

What happens if you do

synchronized (dedicatedLockObject) { ... }

instead of acquiring the semaphore?
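For comparison, the synchronized variant could look roughly like the sketch below. The names are illustrative. Since every access happens inside the synchronized block, a plain HashMap is enough, as another answer here points out.

```java
import java.util.HashMap;
import java.util.Map;

class SynchronizedFooCache {
    // Stand-in for the expensive object from the question.
    static final class Foo {}

    private static final Object dedicatedLockObject = new Object();

    // A plain HashMap is safe here only because it is never touched outside the lock.
    private static final Map<String, Foo> map = new HashMap<String, Foo>();

    static Foo getOrCreate(String key) {
        synchronized (dedicatedLockObject) {
            Foo foo = map.get(key);
            if (foo == null) {
                foo = new Foo();     // at most one thread at a time gets here
                map.put(key, foo);
            }
            return foo;
        }
    }
}
```

Unlike tryAcquire() with a timeout, a synchronized block cannot give up waiting, and a single lock object serializes access for all keys. The trade-off is that the lock is always released, even when an exception is thrown.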

Why are you locking a concurrent hash map? By definition it's thread-safe. If there's a problem, it's in your locking code. That's why we have thread-safe packages in Java. The best way to debug this is with barrier synchronization.
