
Java ConcurrentHashMap

In an application where one thread is responsible for continuously updating a map and the main thread periodically reads the map, is it sufficient to use a ConcurrentHashMap? Or should I explicitly lock operations in synchronized blocks? Any explanation would be great.

Update

I have a getter and a setter for the map (encapsulated in a custom type) which can be used simultaneously by both threads. Is a ConcurrentHashMap still a good solution? Or should I synchronize the getter/setter (or perhaps declare the instance variable volatile)? I just want to make sure that this extra detail doesn't change the solution.

As long as you perform all operations in one method call to the ConcurrentHashMap, you don't need additional locking. Unfortunately, if you need to perform a number of method calls atomically, you have to use locking, in which case a ConcurrentHashMap doesn't help and you may as well use a plain HashMap.
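A minimal sketch of that distinction (not from the answer above; the class, key names and counter logic are made up for illustration): a single call such as putIfAbsent or the compare-and-set style replace is atomic by itself, while a separate get followed by a put is a compound action that needs either a retry loop or external locking.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class AtomicityExample {
    private final ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();

    // Safe without extra locking: one method call is atomic on its own.
    public void initCounter(String key) {
        map.putIfAbsent(key, 0);
    }

    // NOT safe: get and put are two separate calls, so another thread can
    // update the key in between and one of the increments is lost.
    public void incrementBroken(String key) {
        Integer old = map.get(key);
        map.put(key, old == null ? 1 : old + 1);
    }

    // One way to make the compound operation atomic without a lock: retry with
    // replace(), which only succeeds if the value has not changed meanwhile.
    public void increment(String key) {
        for (;;) {
            Integer old = map.get(key);
            if (old == null) {
                if (map.putIfAbsent(key, 1) == null)
                    return;
            } else if (map.replace(key, old, old + 1)) {
                return;
            }
        }
    }
}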

@James' suggestion got me thinking about whether tuning down unneeded concurrency makes a ConcurrentHashMap faster. It should reduce memory, but you would need thousands of these maps for that to make much difference. So I wrote this test, and it does not appear obvious that you would always need to tune the concurrency level.

warmup: Average access time 36 ns.
warmup2: Average access time 28 ns.
1 concurrency: Average access time 25 ns.
2 concurrency: Average access time 25 ns.
4 concurrency: Average access time 25 ns.
8 concurrency: Average access time 25 ns.
16 concurrency: Average access time 24 ns.
32 concurrency: Average access time 25 ns.
64 concurrency: Average access time 26 ns.
128 concurrency: Average access time 26 ns.
256 concurrency: Average access time 26 ns.
512 concurrency: Average access time 27 ns.
1024 concurrency: Average access time 28 ns.

Code

public static void main(String[] args) {
    test("warmup", new ConcurrentHashMap());
    test("warmup2", new ConcurrentHashMap());
    for (int i = 1; i <= 1024; i += i)
        test(i + " concurrency", new ConcurrentHashMap(16, 0.75f, i));
}

private static void test(String description, ConcurrentHashMap map) {
    Integer[] ints = new Integer[2000];
    for(int i=0;i<ints.length;i++)
        ints[i] = i;
    long start = System.nanoTime();
    for(int i=0;i<20*1000*1000;i+=ints.length) {
        for (Integer j : ints) {
            map.put(j,1);
            map.get(j);
        }
    }
    long time = System.nanoTime() - start;
    System.out.println(description+": Average access time "+(time/20/1000/1000/2)+" ns.");
}

As @bestsss points out, a larger concurrency level can be slower because it has poorer caching characteristics.

EDIT: Further to @bestsss' concern about whether loops get optimised away when there are no method calls, here are three loops, all the same but iterating different numbers of times. They print:

10M: Time per loop 661 ps.
100K: Time per loop 26490 ps.
1M: Time per loop 19718 ps.
10M: Time per loop 4 ps.
100K: Time per loop 17 ps.
1M: Time per loop 0 ps.


{
    int loops = 10*1000 * 1000;
    long product = 1;
    long start = System.nanoTime();
    for(int i=0;i< loops;i++)
        product *= i;
    long time = System.nanoTime() - start;
    System.out.println("10M: Time per loop "+1000*time/loops+" ps.");
}
{
    int loops = 100 * 1000;
    long product = 1;
    long start = System.nanoTime();
    for(int i=0;i< loops;i++)
        product *= i;
    long time = System.nanoTime() - start;
    System.out.println("100K: Time per loop "+1000*time/loops+" ps.");
}
{
    int loops = 1000 * 1000;
    long product = 1;
    long start = System.nanoTime();
    for(int i=0;i< loops;i++)
        product *= i;
    long time = System.nanoTime() - start;
    System.out.println("1M: Time per loop "+1000*time/loops+" ps.");
}
// code for three loops repeated

That is sufficient, as the purpose of ConcurrentHashMap is to allow lockless get / put operations, but make sure you are using it with the correct concurrency level. From the docs:

Ideally, you should choose a value to accommodate as many threads as will ever concurrently modify the table. Using a significantly higher value than you need can waste space and time, and a significantly lower value can lead to thread contention. But overestimates and underestimates within an order of magnitude do not usually have much noticeable impact. A value of one is appropriate when it is known that only one thread will modify and all others will only read. Also, resizing this or any other kind of hash table is a relatively slow operation, so, when possible, it is a good idea to provide estimates of expected table sizes in constructors.

See http://download.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html.
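As an illustrative sketch of that guidance for the scenario in the question (one writer, periodic readers; the initial capacity of 1024 is only a made-up estimate of the expected table size):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SizingExample {
    // Single writer thread, readers elsewhere: concurrency level 1 follows the
    // Javadoc advice quoted above. 1024 is a placeholder estimate of the
    // expected number of entries; 0.75f is the default load factor.
    static final Map<String, Integer> MAP =
            new ConcurrentHashMap<String, Integer>(1024, 0.75f, 1);
}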

EDIT:

The wrapped getter/setter makes no difference so long as the map is still being read and written by multiple threads. You could instead synchronize the whole map, but that defeats the purpose of using a ConcurrentHashMap.
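For concreteness, a hypothetical sketch of such a wrapper (the class and field names are invented, since the question's custom type isn't shown): if the field is assigned once and only the map's contents change, final is enough and the ConcurrentHashMap handles the concurrent access itself; only if the setter can swap in a whole new map does the field need to be volatile so readers see the new reference.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Field assigned once, contents mutated concurrently: no extra synchronization needed.
class MapHolder {
    private final Map<String, String> map = new ConcurrentHashMap<String, String>();

    public Map<String, String> getMap() {
        return map;
    }
}

// Setter can replace the whole map: make the field volatile so the reading
// thread sees the new reference; each map is still a ConcurrentHashMap.
class ReplaceableMapHolder {
    private volatile Map<String, String> map = new ConcurrentHashMap<String, String>();

    public Map<String, String> getMap() {
        return map;
    }

    public void setMap(Map<String, String> newMap) {
        this.map = newMap;
    }
}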

A ConcurrentHashMap is a good solution for a situation involving many write operations and fewer read operations. The downside is that a reader is not guaranteed to see the most recent writes at any particular moment, so if you require the reader to always see the most up-to-date version of the map, it is not a good solution.

From the Java 6 API documentation:

Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries.

If that is not acceptable for your project, your best solution is really a fully synchronized map. Solutions for many write operations with few read operations, as far as I know, compromise up-to-date reads in order to achieve faster, non-blocking writes. If you do go with that approach, the Collections.synchronizedMap(...) method creates a fully synchronized wrapper (one reader or writer at a time) around any map object, which is easier than writing your own.
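A short sketch of that alternative (the map contents here are just for illustration); note that, per the Collections.synchronizedMap Javadoc, iteration over the wrapper still has to be synchronized manually:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapExample {
    public static void main(String[] args) {
        // Every get and put acquires the same lock, so a reader always sees the
        // latest completed write, at the cost of blocking concurrent access.
        Map<String, Integer> map =
                Collections.synchronizedMap(new HashMap<String, Integer>());

        map.put("answer", 42);                 // individual calls are thread-safe
        System.out.println(map.get("answer"));

        // Iterating still requires holding the wrapper's lock explicitly.
        synchronized (map) {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                System.out.println(e.getKey() + " = " + e.getValue());
            }
        }
    }
}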

You'd be better off using ConcurrentHashMap, as its implementation doesn't normally block reads. If you synchronize externally, you'll end up blocking most reads as well, because you don't have access to the internal knowledge of the implementation needed to avoid that.

If there is only one writer, it should be safe to just use a ConcurrentHashMap. If you feel the need to synchronize, there are other map implementations that do the synchronization for you and will be faster than writing the synchronization manually.

Yes... and to optimize it better, you should set the concurrency level to 1.

From Javadoc:

The allowed concurrency among update operations is guided by the optional concurrencyLevel constructor argument (default 16), which is used as a hint for internal sizing. .... A value of one is appropriate when it is known that only one thread will modify and all others will only read.

This solution works because of the memory consistency effects of ConcurrentMaps: as with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread.
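A minimal sketch of that guarantee (the Quote class and the "ACME" key are made up): every write the updater thread performs before the put() is visible to a reader whose get() returns that entry, even for plain non-volatile fields.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class HappensBeforeExample {
    static class Quote {
        double bid;
        double ask;   // plain fields, deliberately not volatile or final
    }

    static final ConcurrentMap<String, Quote> quotes =
            new ConcurrentHashMap<String, Quote>();

    public static void main(String[] args) {
        new Thread(new Runnable() {
            public void run() {
                Quote q = new Quote();
                q.bid = 99.5;           // writes made before the put()...
                q.ask = 100.5;
                quotes.put("ACME", q);  // ...happen-before any get() that returns this entry
            }
        }).start();

        new Thread(new Runnable() {
            public void run() {
                Quote q;
                while ((q = quotes.get("ACME")) == null) {
                    // spin until the writer has published the quote
                }
                // Guaranteed to print 99.5 / 100.5: the successful get()
                // happens-after the put(), so both plain fields are visible.
                System.out.println(q.bid + " / " + q.ask);
            }
        }).start();
    }
}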
