简体   繁体   中英

How can we have entry-level locking in HashMap?

Since, statckoverflow does not allow add more thing to your question in the original question (you can only add comment, not code) I am asking a sequential question to my original question here: Can we use Synchronized for each entry instead of ConcurrentHashMap?

The problem is very simple, and I don't know why for such a simple problem that probably many people have encountered before me I should spend this much time :/

The problem is: I have a hashmap, I want when one thread is working on one of the entries of the hashMap, no any other thread access that object, and I don't want to lock the whole hashMap.

I know that java provides ConcurrentHashMap, but ConcurrentHashMap does not solve the problem, when you want to do thing more complex than simple put and get. Even newly added functions (in Java 8) like merge is not enough for complex scenarios.

For example:

Suppose I want a hash map that maps strings to ArrayLists. Then for example suppose I want to do this: For key k, if there is any entry, add newString to its ArrayList, but if there is no entry for k, create the entry for k such that its ArrayList has newString.

I was thinking I can do it as follows:

                ArrayList<String> tm =new ArrayList<String>();
                tm.add(newString);
                Object result = map.putIfAbsent(k, tm);
                if  (result != null)
                {
                    map.get(k).add(newString);
                }

But it does not work, why? suppose putIfAbset return something other than null, then it means that map already has an entry with key k, so I will try to add newString to the ArrayList of the already existing entry, but right before adding, another thread may remove the entry, and then I will get NullPointerException!

So, I found it very difficult to code such things properly.

But I was thinking that if I can simply lock that entry, life will be wonderful!

In my previous post I suggested something very simple that in fact eliminates the need for concurrentHashMap, and provide entry-level locking but some said that is not true because Long is not immutable ... that I didn't get it well.

Now, I implemented and tested it, it looks good to me, but I don't know why other more experienced developers here told me it is not thread-safe :(

This is the exact code that I tested:

MainThread:

import java.util.HashMap;

public class mainThread {

public static HashMap<String, Long> map = new HashMap<String, Long>();

public static void main (String args[])
{
    map.put("k1", new Long(32));


    synchronized(map.get("k1"))
    {
        Thread t = new Thread(new threadA());
        t.start();
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }
}
}

ThreadA:

public class ThreadA implements Runnable {

    @Override
    public void run() {
    mainThread.map.put("k2", new Long(21));
    System.out.println(mainThread.map.get("k2"));


    synchronized (mainThread.map.get("k1")) {
        System.out.println("Insdie synchronized of threadA");
    }
}
}

It works fine! It prints 21, and after 5 seconds, that mainThread release the lock of map.get("k1"), it prints "Insdie synchronized of threadA"

So, why using this simple approach we cannot provide entry-level locking?! why concurrency should be that much complicated Lol (just kidding)

First of all, there is no standard map implementation that I am aware of that provides entry level locking.

But I think you can avoid the need for that. For example

UPDATE ... corrected mistake

ArrayList<String> tm = new ArrayList<String>();
ArrayList<String> old = map.putIfAbsent(k, tm);
if (old != null) {
    tm = old;
}
synchronized (tm) {
    // can now add / remove entries and this will appear as an atomic
    // actions to other threads that are using `synchronized` to 
    // access or update the list
    tm.add(string1);
    tm.add(string2);
}

Yes it is possible that another thread will update the list in the hashmap entry between this thread (possibly) inserting it, and this thread locking it. However, that doesn't matter. The (corrected) putIfAbsent and the test that follows ensures that everyone will use and lock the same list.

(Assumption: that all threads use this logic when inserting / updating an entry.)


Atomically removing the list if it becomes empty is difficult, but I would argue that it is usually unnecessary to do that.


UPDATE 2

There is a better way:

ArrayList<String> tm = map.computeIfAbsent(k, ArrayList::new);
synchronized (tm) {
    ...
}

(Thanks Stuart)


UPDATE 3

We can do it with merger too.

Maybe, yes. Something like this:

ArrayList<String> tm = new ArrayList<String>;
tm.add(...);
...
map.merge(key, tm, (oldV, newV) -> {oldV.addAll(newV); return oldV});

The downside is that you are double-handling all the elements of tm ; ie adding to 2 separate lists (one of which you throw way).

But you could also do this:

map.merge(key, tm, (oldV, newV) -> {
      oldV.removeAll(newV); 
      return oldV.size() == 0 ? null : oldV}
);

The thing that concerns me is that the javadoc does not state explicitly that the value oldV will be locked while this is happening. It says:

"The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress ..."

... but it does not explicitly state that there is mutual exclusion on the value while this is happening. (For instance, mixing this approach with putIfAbsent / computeIfAbsent and an explicit synchronized block would most likely be hazardous. The locking would most likely be on different objects.)

Well, the first huge problem is that you don't even attempt to do any locking for the put calls. Those aren't automatically threadsafe for a regular HashMap. You seem to be under the impression that separate HashMap entries are completely independent automatically, but HashMaps don't work that way.

Even if you fix the put problem (probably requiring ConcurrentHashMap or a whole-map lock anyway), the parts you actually are locking for aren't locking safely.

Say thread 1 put s the entry "k1": 1 , and thread 2 tries to get("k1") . What will thread 2 see?

Well, thread 2 doesn't even try to acquire any locks until the get call is already done . The get call is completely unprotected! Without any happens-before relation between the put and the get , the get call might not see the entry, or it might see the entry, or it might see the map in an inconsistent intermediate state and crash horribly.

Synchronizing on the result of the get call is synchronizing far too late.

I think I have finally found the solution using merge function. I provide an example, I will edit this post to make it easier for others to read, but I just post now to have your feedback.

Here is the example of a ConcurrentHashMap that has ConcurrentHashMaps as its values (23 and 1 are just two random value for sake of example):

    Long intialValue = new Long(3);
    Long addedValue = new Long(10);
    Long removingValue = new Long (5);

    ConcurrentHashMap<Integer, ConcurrentHashMap<Integer, Long>> map = new ConcurrentHashMap<>();

    //Initialization....
    ConcurrentHashMap<Integer, Long> i = new ConcurrentHashMap<Integer, Long>();
    i.put(1, intialValue);
    map.put(23, i);
    //......

    //addition
    ConcurrentHashMap<Integer, Long> c = new ConcurrentHashMap<Integer, Long>();
    c.put(1, addedValue);
    map.merge(23, c, (oldHashMap, newHashMap) -> {
        oldHashMap.merge (1, c.get(1), (oldV, newV) -> { 
            if (oldV < newV) return newV; else return oldV; 
            });
        return oldHashMap;
        });



    //removal
    // we want to remove entry 1 from the inner HashMap if its value is less than 2, and if the entry is empty remove the entry from the outer HashMap 
    ConcurrentHashMap<Integer, Long> r = new ConcurrentHashMap<Integer, Long>();
    r.put(1, removingValue);
    map.merge (23, r, (oldHashMap, newHashMap) -> {
    oldHashMap.merge(1, newHashMap.get(1), (oldV, newV) -> {if  (oldV  < newV) return newV; else return oldV;});
    return oldHashMap;
    });
    map.remove(23, r);

    if (map.containsKey(23))
    {
        System.out.println("Map contains key 23");
        if (map.get(23).containsKey(1))
        {
            System.out.println("The value for <23,1> is " + map.get(23).get(1));
        }
    }

This is what the code does:

  1. initialization : first creates the map and puts another map into it for key 23 which has value initialValue for key 1.
  2. addition : Then checks, 1) if for key 23, there is no value, it puts a map that has value addedValue for key 1, otherwise 2) if key 23 has already a value, it checks its value if the value has a value less than addedValue , it overwrites it with the addedValue , otherwise it leaves it alone.
  3. removal : Finally, it checks, if for key 23, and for key 1 in the value for 23, the value is less than removingValue , it removes that, and if the hashMap of key 23 is empty after this removal, it removes key 23 from the main map.

I tested this code. So for example:

  • for 3, 10, 5, the final value for <23,1> is 10.
  • for 20, 10, 11, the final value is 20.
  • for 3, 10, 11, the final value is nothing,because entry 23 is removed.

I hope it is thread-safe as I just used merge method. One disadvantage of this code is that I am adding something to map and then remove it, just because ConcurrentHashMap does not have a method for remove similar to merge. I wish I had this method:

map.remove (keyToRemove, condition )

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM