
Java ConcurrentHashMap


In an application where one thread is responsible for continuously updating a map and the main thread periodically reads the map, is it sufficient to use a ConcurrentHashMap? Or should I explicitly lock the operations in synchronized blocks? Any explanation would be great.

Update

I have a getter and a setter for the map (encapsulated in a custom type) which can be used simultaneously by both threads. Is a ConcurrentHashMap still a good solution? Or should I synchronize the getter/setter (or perhaps declare the instance variable volatile)? I just want to make sure this extra detail doesn't change the solution.
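To make the setup concrete, here is a minimal sketch of it (the wrapper class StatsHolder and its method names are invented for illustration): one background thread updates the map, the main thread reads it, and the map itself is held in a final field so both threads share the same instance.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical wrapper type, for illustration only.
    class StatsHolder {
        // Final reference to a ConcurrentHashMap: safely shared by both threads.
        private final Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();

        // Called repeatedly by the single writer thread.
        void update(String key, int value) {
            map.put(key, value);
        }

        // Called periodically by the main (reader) thread.
        Integer read(String key) {
            return map.get(key);
        }
    }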

As long as you perform all operations in a single method call on the ConcurrentHashMap, you don't need any additional locking. Unfortunately, if you need to perform several method calls atomically, you have to use locking, in which case a ConcurrentHashMap doesn't help and you may as well use a plain HashMap.
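To illustrate the distinction, a hedged sketch (class and method names are invented): a get followed by a put is not atomic across the two calls, whereas single-call operations such as putIfAbsent and replace are, so a retry loop built on them needs no external lock.

    import java.util.concurrent.ConcurrentHashMap;

    class CompoundOps {
        private final ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();

        // NOT atomic: another thread can update the entry between the get and the put.
        void incrementRacy(String key) {
            Integer current = map.get(key);
            map.put(key, current == null ? 1 : current + 1);
        }

        // Atomic without external locking: retry using single-call operations.
        void incrementAtomic(String key) {
            for (;;) {
                Integer current = map.putIfAbsent(key, 1);
                if (current == null)
                    return; // we inserted the first value
                if (map.replace(key, current, current + 1))
                    return; // nobody changed the entry in between
            }
        }
    }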

@James' suggestion got me thinking about whether tuning away unneeded concurrency makes a ConcurrentHashMap faster. It should reduce memory, but you would need thousands of these maps for that to make much difference. So I wrote this test, and it doesn't appear that you would always need to tune the concurrency level.

warmup: Average access time 36 ns.
warmup2: Average access time 28 ns.
1 concurrency: Average access time 25 ns.
2 concurrency: Average access time 25 ns.
4 concurrency: Average access time 25 ns.
8 concurrency: Average access time 25 ns.
16 concurrency: Average access time 24 ns.
32 concurrency: Average access time 25 ns.
64 concurrency: Average access time 26 ns.
128 concurrency: Average access time 26 ns.
256 concurrency: Average access time 26 ns.
512 concurrency: Average access time 27 ns.
1024 concurrency: Average access time 28 ns.

Code

    public static void main(String[] args) {
        test("warmup", new ConcurrentHashMap());
        test("warmup2", new ConcurrentHashMap());
        for (int i = 1; i <= 1024; i += i)
            test(i + " concurrency", new ConcurrentHashMap(16, 0.75f, i));
    }

    private static void test(String description, ConcurrentHashMap map) {
        Integer[] ints = new Integer[2000];
        for (int i = 0; i < ints.length; i++)
            ints[i] = i;
        long start = System.nanoTime();
        for (int i = 0; i < 20 * 1000 * 1000; i += ints.length) {
            for (Integer j : ints) {
                map.put(j, 1);
                map.get(j);
            }
        }
        long time = System.nanoTime() - start;
        System.out.println(description + ": Average access time " + (time / 20 / 1000 / 1000 / 2) + " ns.");
    }

As @bestsss points out, a larger concurrency level can be slower as it has poorer caching characteristics.

EDIT: Further to @bestsss' concern about whether loops get optimised away when there are no method calls, here are three loops, all the same but iterating a different number of times. They print:

10M: Time per loop 661 ps.
100K: Time per loop 26490 ps.
1M: Time per loop 19718 ps.
10M: Time per loop 4 ps.
100K: Time per loop 17 ps.
1M: Time per loop 0 ps.


{
    int loops = 10 * 1000 * 1000;
    long product = 1;
    long start = System.nanoTime();
    for (int i = 0; i < loops; i++)
        product *= i;
    long time = System.nanoTime() - start;
    System.out.println("10M: Time per loop " + 1000 * time / loops + " ps.");
}
{
    int loops = 100 * 1000;
    long product = 1;
    long start = System.nanoTime();
    for (int i = 0; i < loops; i++)
        product *= i;
    long time = System.nanoTime() - start;
    System.out.println("100K: Time per loop " + 1000 * time / loops + " ps.");
}
{
    int loops = 1000 * 1000;
    long product = 1;
    long start = System.nanoTime();
    for (int i = 0; i < loops; i++)
        product *= i;
    long time = System.nanoTime() - start;
    System.out.println("1M: Time per loop " + 1000 * time / loops + " ps.");
}
// code for three loops repeated

That is sufficient, as the purpose of ConcurrentHashMap is to let get/put operations proceed concurrently without external locking, but make sure you are using it with the correct concurrency level. From the docs:

Ideally, you should choose a value to accommodate as many threads as will ever concurrently modify the table. Using a significantly higher value than you need can waste space and time, and a significantly lower value can lead to thread contention. But overestimates and underestimates within an order of magnitude do not usually have much noticeable impact. A value of one is appropriate when it is known that only one thread will modify and all others will only read. Also, resizing this or any other kind of hash table is a relatively slow operation, so, when possible, it is a good idea to provide estimates of expected table sizes in constructors.

See http://download.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html.

EDIT:

The wrapped getter/setter makes no difference so long as the map is still being read and written by multiple threads. You could lock the whole map instead, but that defeats the purpose of using a ConcurrentHashMap.

A ConcurrentHashMap is a good solution for a situation involving lots of write operations and fewer read operations. The downside is that there is no guarantee about which writes a reader will see at any particular moment. So if you require the reader to see the most up-to-date version of the map, it is not a good solution.

From the Java 6 API documentation:

Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries.

If that is not acceptable for your project, your best solution is really a fully synchronized lock. As far as I know, solutions for many write operations with few read operations compromise up-to-date reads in order to achieve faster, non-blocking writes. If you do go with this solution, the Collections.synchronizedMap(...) method creates a fully synchronized, single-reader/writer wrapper for any map object. Easier than writing your own.
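For reference, a minimal sketch of that alternative (keys and values are invented for the example); note that, per the Collections.synchronizedMap documentation, iteration over the wrapper must still be synchronized manually:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    public class SynchronizedMapExample {
        public static void main(String[] args) {
            // Every call on the wrapper is guarded by a single lock,
            // so a reader always sees the latest completed write.
            Map<String, Integer> map =
                    Collections.synchronizedMap(new HashMap<String, Integer>());

            map.put("hits", 1);                  // e.g. from the writer thread
            System.out.println(map.get("hits")); // e.g. from the reader thread

            // Iteration must be synchronized on the wrapper itself.
            synchronized (map) {
                for (Map.Entry<String, Integer> e : map.entrySet()) {
                    System.out.println(e.getKey() + "=" + e.getValue());
                }
            }
        }
    }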

You'd be better off using a ConcurrentHashMap, as its implementation doesn't normally block reads. If you synchronize externally, you'll end up blocking most reads, because you don't have access to the internal knowledge of the implementation needed to avoid doing so.

If there is only one writer, it should be safe to just use a ConcurrentHashMap. If you feel the need to synchronize, there are other map implementations that do the synchronization for you and will be faster than writing the synchronization by hand.

Yes... and to optimize it further, you should set the concurrency level to 1.

From the Javadoc:

The allowed concurrency among update operations is guided by the optional concurrencyLevel constructor argument (default 16), which is used as a hint for internal sizing. ... A value of one is appropriate when it is known that only one thread will modify and all others will only read.
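For instance, a hedged sketch of constructing the map for the single-writer case described in the question (16 and 0.75f are simply the documented default initial capacity and load factor):

    import java.util.concurrent.ConcurrentHashMap;

    public class SingleWriterMap {
        // concurrencyLevel = 1: one updating thread, any number of readers.
        private final ConcurrentHashMap<String, Integer> map =
                new ConcurrentHashMap<String, Integer>(16, 0.75f, 1);
    }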

This solution works because of the memory consistency effects of ConcurrentMap: as with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread.
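A hedged illustration of that happens-before guarantee (the Snapshot class and field names are invented for the example): everything the writer does before the put() is visible to a reader that later sees the entry via get(), with no extra synchronization.

    import java.util.concurrent.ConcurrentHashMap;

    public class PublicationExample {
        // Value object that is fully initialised before it is published via put().
        static class Snapshot {
            final long timestamp;
            final int value;
            Snapshot(long timestamp, int value) {
                this.timestamp = timestamp;
                this.value = value;
            }
        }

        static final ConcurrentHashMap<String, Snapshot> MAP =
                new ConcurrentHashMap<String, Snapshot>();

        // Writer thread: construct, then publish.
        static void publish(int value) {
            MAP.put("latest", new Snapshot(System.currentTimeMillis(), value));
        }

        // Reader thread: if get() returns the snapshot, its fields are safely visible.
        static void read() {
            Snapshot s = MAP.get("latest");
            if (s != null) {
                System.out.println(s.timestamp + " -> " + s.value);
            }
        }
    }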
