Why is Collections.synchronizedSet(HashSet) faster than HashSet for addAll, retainAll, and contains?

Question

I ran a test to find the best concurrent Set implementation for my program, with a non-synchronized HashSet as a control, and ran into an interesting result: the addAll, retainAll, and contains operations for a Collections.synchronizedSet(HashSet) appear to be faster than those of a regular HashSet. My understanding is that a SynchronizedSet(HashSet) should never be faster than a HashSet because it consists of a HashSet with synchronization locks. I've run the test quite a few times now, with similar results. Am I doing something wrong?

Relevant results:

Testing set: HashSet
Add: 17.467758 ms
Retain: 28.865039 ms
Contains: 22.18998 ms
Total: 68.522777 ms
--
Testing set: SynchronizedSet
Add: 17.54269 ms
Retain: 20.173502 ms
Contains: 19.618188 ms
Total: 57.33438 ms

Relevant code:

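import java.util.Collections;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class SetPerformance {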
    static Set<Long> source1 = new HashSet<>();
    static Set<Long> source2 = new HashSet<>();
    static Random rand = new Random();
    public static void main(String[] args) {
        Set<Long> control = new HashSet<>();
        Set<Long> synch = Collections.synchronizedSet(new HashSet<Long>());

        //populate sets to draw values from
        System.out.println("Populating source");
        for(int i = 0; i < 100000; i++) {
            source1.add(rand.nextLong());
            source2.add(rand.nextLong());
        }

        //populate sets with initial values
        System.out.println("Populating test sets");
        control.addAll(source1);
        synch.addAll(source1);

        testSet(control);
        testSet(synch);
    }

    public static void testSet(Set<Long> set) {
        System.out.println("--\nTesting set: " + set.getClass().getSimpleName());
        long start = System.nanoTime();
        set.addAll(source1);
        long add = System.nanoTime();
        set.retainAll(source1);
        long retain = System.nanoTime();
        boolean test;
        for(int i = 0; i < 100000; i++) {
            test = set.contains(rand.nextLong());
        }
        long contains = System.nanoTime();
        System.out.println("Add: " + (add - start) / 1000000.0 + " ms");
        System.out.println("Retain: " + (retain - add) / 1000000.0 + " ms");
        System.out.println("Contains: " + (contains - retain) / 1000000.0 + " ms");
        System.out.println("Total: " + (contains - start) / 1000000.0 + " ms");
    }
}

Answer 1

You aren't warming up the JVM.

Note that you run the HashSet test first.

I changed your program slightly to run the test in a loop 5 times. SynchronizedSet was faster, on my machine, in only the first test.
Then, I tried reversing the order of the two tests, and only running the test once. HashSet won again.
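A minimal sketch of the looped version, not the exact program I ran. The class name WarmupSketch, the timeOps helper, and the sink field are made up for illustration; the set sizes follow the question. Each round starts from fresh copies so both sets do the same work, and the first round or two mostly measure the JIT warming up.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class WarmupSketch {
    // Written after each run so the JIT cannot discard the contains loop as dead code.
    static volatile int sink;

    // Hypothetical helper: runs addAll, retainAll and a contains loop on the
    // given set and returns the elapsed time in nanoseconds.
    static long timeOps(Set<Long> set, Set<Long> source, long[] probes) {
        long start = System.nanoTime();
        set.addAll(source);
        set.retainAll(source);
        int hits = 0;
        for (long p : probes) {
            if (set.contains(p)) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        sink = hits;
        return elapsed;
    }

    public static void main(String[] args) {
        Random rand = new Random(42);
        Set<Long> source = new HashSet<>();
        long[] probes = new long[100_000];
        for (int i = 0; i < probes.length; i++) {
            source.add(rand.nextLong());
            probes[i] = rand.nextLong();
        }

        // Repeat the whole comparison; only the later rounds are worth reading.
        for (int round = 1; round <= 5; round++) {
            Set<Long> control = new HashSet<>(source);
            Set<Long> synch = Collections.synchronizedSet(new HashSet<>(source));
            long plainNs = timeOps(control, source, probes);
            long syncNs = timeOps(synch, source, probes);
            System.out.printf("Round %d: HashSet %.3f ms, SynchronizedSet %.3f ms%n",
                    round, plainNs / 1e6, syncNs / 1e6);
        }
    }
}
```

The absolute numbers will differ per machine and JVM, so treat the output as a rough comparison only.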

Read more about this here: How do I write a correct micro-benchmark in Java?

Additionally, check out Google Caliper for a framework that handles all these microbenchmarking issues.

Answer 2

Yes. Try running the synchronized set before the regular one and you will get the results you "needed". I reckon this has to do with JVM warm-up and nothing else. Try warming up the VM with some computations before running the benchmark, or run it a few times in a mixed order.
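Roughly what I mean, as a sketch only. OrderSketch, timeContains, and sink are hypothetical names, and the warm-up count of 20 is an arbitrary guess rather than a tuned value. It runs some untimed passes first, then measures the two sets in both orders.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class OrderSketch {
    // Written after each run so the lookup loop is not optimized away.
    static volatile int sink;

    // Hypothetical helper: nanoseconds spent looking up every probe in the set.
    static long timeContains(Set<Long> set, long[] probes) {
        long start = System.nanoTime();
        int hits = 0;
        for (long p : probes) {
            if (set.contains(p)) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        sink = hits;
        return elapsed;
    }

    public static void main(String[] args) {
        Random rand = new Random(1);
        long[] probes = new long[100_000];
        Set<Long> plain = new HashSet<>();
        for (int i = 0; i < probes.length; i++) {
            probes[i] = rand.nextLong();
            if (i % 2 == 0) {
                plain.add(probes[i]); // about half of the lookups will hit
            }
        }
        Set<Long> synch = Collections.synchronizedSet(new HashSet<>(plain));

        // Untimed warm-up passes (20 is an assumed, arbitrary count).
        for (int i = 0; i < 20; i++) {
            timeContains(plain, probes);
            timeContains(synch, probes);
        }

        // Measure in both orders so neither set always goes first.
        System.out.printf("HashSet first: %.3f ms, then SynchronizedSet: %.3f ms%n",
                timeContains(plain, probes) / 1e6, timeContains(synch, probes) / 1e6);
        System.out.printf("SynchronizedSet first: %.3f ms, then HashSet: %.3f ms%n",
                timeContains(synch, probes) / 1e6, timeContains(plain, probes) / 1e6);
    }
}
```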

Answer 1: score 3, accepted, posted 2014-07-31 19:13:04
Answer 2: score 0, posted 2014-07-31 19:14:43
