简体   繁体   中英

Synchronized HashMap vs ConcurrentHashMap write test

I learn java.util.concurrency and I have found one article about performance ( http://www.javamex.com/tutorials/concurrenthashmap_scalability.shtml ). And I decided to repeat one little part of those performance test for studying purpose. I have written the write test for HashMap and ConcurrentHashMap . I have two questions about it:

  1. Is it true, that for the best performance I should use number threads equal number CPU core?
  2. I understand, that performance vary form platform to platform. But the result has shown that HashMap a little more faster then ConcurrentHashMap . I think it should be the same or vise versa. Maybe I have made a mistake in my code.

Any criticism is welcome.

package Concurrency;

import java.util.concurrent.*;
import java.util.*;

class Writer2 implements Runnable {
    private Map<String, Integer> map;
    private static int index;
    private int nIteration; 
    private Random random = new Random();
    char[] chars = "abcdefghijklmnopqrstuvwxyz".toCharArray();
    private final CountDownLatch latch;

    public Writer2(Map<String, Integer> map, int nIteration, CountDownLatch latch) {
        this.map = map;
        this.nIteration = nIteration;
        this.latch = latch;
    }

    private synchronized String getNextString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            char c = chars[random.nextInt(chars.length)];
            sb.append(c);
        }
        sb.append(index);
        if(map.containsKey(sb.toString()))
            System.out.println("dublicate:" + sb.toString());
        return sb.toString();
    }

    private synchronized int getNextInt() { return index++; }

    @Override
    public void run() {
        while(nIteration-- > 0) {
            map.put(getNextString(), getNextInt());
        }
        latch.countDown();
    }
}

public class FourtyTwo {
    static final int nIteration = 100000;
    static final int nThreads = 4;
    static Long testMap(Map<String, Integer> map) throws InterruptedException{
        String name = map.getClass().getSimpleName(); 
        CountDownLatch latch = new CountDownLatch(nThreads);
        long startTime = System.currentTimeMillis();
            ExecutorService exec = Executors.newFixedThreadPool(nThreads);
            for(int i = 0; i < nThreads; i++)
                exec.submit(new Writer2(map, nIteration, latch));
            latch.await();  
            exec.shutdown();
        long endTime = System.currentTimeMillis();
        System.out.format(name + ": that took %,d milliseconds %n", (endTime - startTime));
        return (endTime - startTime);
    }
    public static void main(String[] args) throws InterruptedException {
        ArrayList<Long> result = new ArrayList<Long>() {
            @Override
            public String toString() {
                Long result = 0L;
                Long size = new Long(this.size());
                for(Long i : this)
                    result += i;
                return String.valueOf(result/size);
            }
        }; 

        Map<String, Integer> map1 = Collections.synchronizedMap(new HashMap<String, Integer>());
        Map<String, Integer> map2 = new ConcurrentHashMap<>();

        System.out.println("Rinning test...");
        for(int i = 0; i < 5; i++) {
            //result.add(testMap(map1)); 
            result.add(testMap(map2));
        }
        System.out.println("Average time:" + result + " milliseconds");

    }

}

/*
OUTPUT:
ConcurrentHashMap: that took 5 727 milliseconds 
ConcurrentHashMap: that took 2 349 milliseconds 
ConcurrentHashMap: that took 9 530 milliseconds 
ConcurrentHashMap: that took 25 931 milliseconds 
ConcurrentHashMap: that took 1 056 milliseconds 
Average time:8918 milliseconds

SynchronizedMap: that took 6 471 milliseconds 
SynchronizedMap: that took 2 444 milliseconds 
SynchronizedMap: that took 9 678 milliseconds 
SynchronizedMap: that took 10 270 milliseconds 
SynchronizedMap: that took 7 206 milliseconds 
Average time:7213 milliseconds

*/

One

How many threads varies, not by CPU, but by what you are doing. If, for example, what you are doing with your threads is highly disk intensive, your CPU isn't likely to be maxed out, so doing 8 threads may just cause heavy thrashing. If, however, you have huge amounts of disk activity, followed by heavy computation, followed by more disk activity, you would benefit from staggering the threads, splitting out your activities and grouping them, as well as using more threads. You would, for example, in such a case, likely want to group together file activity that uses a single file, but maybe not activity where you are pulling from a bunch of files (unless they are written contiguously on the disk). Of course, if you overthink disk IO, you could seriously hurt your performance, but I'm making a point of saying that you shouldn't just shirk it, either. In such a program, I would probably have threads dedicated to disk IO, threads dedicated to CPU work. Divide and conquer. You'd have fewer IO threads and more CPU threads.

It is common for a synchronous server to run many more threads than cores/CPUs because most of those threads either do work for only a short time or don't do much CPU intensive work. It's not useful to have 500 threads, though, if you will only ever have 2 clients and the context switching of those excess threads hampers performance. It's a balancing act that often requires a little bit of tuning.

In short

  1. Think about what you are doing
    • Network activity is light,so more threads are generally good
    • CPU intensive things don't do much good if you have 2x more of those threads than cores... usually a little more than 1x or a little less than 1x is optimum, but you have to test, test, test
    • Having 10 disk IO intensive threads may hurt all 10 threads, just like having 30 CPU intensive threads... the thrashing hurts them all
  2. Try to spread out the pain
    • See if it helps to spread out the CPU, IO, etc, work or if clustering is better... it will depend on what you are doing
  3. Try to group things up
    • If you can, separate out your disk, IO, and network tasks and give them their own threads that are tuned to those tasks

Two

In general, thread-unsafe methods run faster. Similarly using localized synchronization runs faster than synchronizing the entire method. As such, HashMap is normally significantly faster than ConcurrentHashMap. Another example would be StringBuffer compared to StringBuilder. StringBuffer is synchronized and is not only slower, but the synchronization is heavier (more code, etc); it should rarely be used. StringBuilder, however, is unsafe if you have multiple threads hitting it. With that said, StringBuffer and ConcurrentHashMap can race, too. Being "thread-safe" doesn't mean that you can just use it without thought, particularly the way that these two classes operate. For example, you can still have a race condition if you are reading and writing at the same time (say, using contains(Object) as you are doing a put or remove). If you want to prevent such things, you have to use your own class or synchronize your calls to your ConcurrentHashMap.

I generally use the non-concurrent maps and collections and just use my own locks where I need them . You'll find that it's much faster that way and the control is great. Atomics (eg AtomicInteger) are nice sometimes, but really not generally useful for what I do. Play with the classes, play with synchronization, and you'll find that you can master than more efficiently than the shotgun approach of ConcurrentHashMap, StringBuffer, etc. You can have race conditions whether or not you use those classes if you don't do it right... but if you do it yourself, you can also be much more efficient and more careful.


Example

Note that we have a new Object that we are locking on. Use this instead of synchronized on a method.

public final class Fun {
    private final Object lock = new Object();

    /*
     * (non-Javadoc)
     *
     * @see java.util.Map#clear()
     */
    @Override
    public void clear() {
        // Doing things...
        synchronized (this.lock) {
            // Where we do sensitive work
        }
    }

    /*
     * (non-Javadoc)
     *
     * @see java.util.Map#put(java.lang.Object, java.lang.Object)
     */
    @Override
    public V put(final K key, @Nullable final V value) {
        // Doing things...
        synchronized (this.lock) {
            // Where we do sensitive work
        }
        // Doing things...
    }
}

And From Your Code...

I might not put that sb.append(index) in the lock or might have a separate lock for index calls, but...

    private final Object lock = new Object();

    private String getNextString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            char c = chars[random.nextInt(chars.length)];
            sb.append(c);
        }
        synchronized (lock) {
            sb.append(index);
            if (map.containsKey(sb.toString()))
                System.out.println("dublicate:" + sb.toString());
        }
        return sb.toString();
    }

    private int getNextInt() {
        synchronized (lock) {
            return index++;
        }
    }

The article you linked doesn't state that ConcurrentHashMap is generally faster than a synchronized HashMap, only that it scales better; ie it is faster for a large number of threads. As you can see in their graph, for 4 threads the performance is very similar.

Apart from that, you should test with more items, as my results show, they can vary quite a bit:

Rinning test...
SynchronizedMap: that took 13,690 milliseconds 
SynchronizedMap: that took 8,210 milliseconds 
SynchronizedMap: that took 11,598 milliseconds 
SynchronizedMap: that took 9,509 milliseconds 
SynchronizedMap: that took 6,992 milliseconds 
Average time:9999 milliseconds

Rinning test...
ConcurrentHashMap: that took 10,728 milliseconds 
ConcurrentHashMap: that took 7,227 milliseconds 
ConcurrentHashMap: that took 6,668 milliseconds 
ConcurrentHashMap: that took 7,071 milliseconds 
ConcurrentHashMap: that took 7,320 milliseconds 
Average time:7802 milliseconds

Note that your code isn't clearing the Map between loops, adding nIteration more items every time... is that what you wanted?

synchronized isn't necessary on your getNextInt/getNextString, as they aren't being called from multiple threads.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM