简体   繁体   中英

HashMap vs LinkedHashMap performance in iteration over values()

HashMapLinkedHashMap在遍历values()函数时是否存在性能差异?

I think the LinkedHashMap has to be faster in traversal due to a superior nextEntry implementation in its Iterator

Here is why :

Let us go step by step from the values implementation.
The HashMap implementation of values is this :

public Collection<V> values() {
    Collection<V> vs = values;
    return (vs != null ? vs : (values = new Values()));
}

The LinkedHashMap extends from HashMap and inherits the same implementation.

The difference is in the Iterator implementation for the Values in both.

for HashMap it extends from java.util.HashMap.HashIterator

private final class ValueIterator extends HashIterator<V> {
    public V next() {
        return nextEntry().value;
    }
}

but for LinkedHashMap it extends from java.util.LinkedHashMap.LinkedHashIterator

private class ValueIterator extends LinkedHashIterator<V> {
    public V next() { return nextEntry().value; }
}

so the difference essentially boils down to nextEntry implementation.

For LinkedHashMap it is just calling e.after where e is the Entry , but for HashMap there is some work involved in traversing the Entry[] array to find the next next.

UPDATE : Code for nextEntry() in HashMap

final Entry<K,V> nextEntry() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    Entry<K,V> e = next;
    if (e == null)
        throw new NoSuchElementException();

    if ((next = e.next) == null) {
        Entry[] t = table;
        while (index < t.length && (next = t[index++]) == null)
            ;
    }
    current = e;
    return e;
}

The Entry[] is not a contiguous store. (There could be null values in between). If you take a look at the above code, what it does is point next to current and find the next next by iterating over the Entry[] .

But I think this performance gain will come at the cost of insertion. Check out the addEntry method in both classes as an exercise.

I wrote a little profiling program creating 1 million keys (Integer) vs Boolean.TRUE, repeating 100 times. Found the following:

HashMap:-
Create:  3.7sec
Iterate: 1.1sec
Access:  1.5sec
Total:   6.2sec

LinkedHashMap:-
Create:  4.7sec   (30% slower)
Iterate: 0.5sec   (50% faster)
Access:  0.8sec   (50% faster)
Total :  6.0sec

Garbage collection NOT done so pollutes the numbers somewhat, however I think LinkedHashMap has the edge over HashMap and I will be using that in future code.

It almost does not matter. The question is: what do you need. If order of elements is relevant you have to use LinkedHashMap . Otherwise you just do not need it, so use HashMap .

The best advice would be "Don't be afraid to try it out" but I'm quite sure they are very similar. Getter for the value set is O(1) and so is each iterator step. Iterating through a linked list is as trivial as iterating through the hash buckets, with a possible small edge in favor of the linked list.

是的,在HashMapLinkedHashMap所有迭代中都会有相同的性能差异: HashMap花费的时间与条目数加上哈希表的大小成正比,而LinkedHashMap花费的时间与条目数成正比.

我在 UnitTest 中尝试过,迭代 values() 10000 次,毫秒数:806 vs 902。几乎相同。

Code...

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapTest
{
    public static void main(String[] args)
    {
        int iterations = 1000000;

        long time1, time2, time3;
        System.nanoTime(); //Just to make sure any possible overhead is done...though there shouldn't really be any

        int sequential[] = new int[iterations]; //Counting up from 0
        int random[] = new int[iterations]; //Same set of values, but randomized (no duplicates)
        HashSet<Integer> addedRandoms = new HashSet<>();
        for (int i = 0; i < iterations; i++)
        {
            sequential[i] = i;

            int randomVal = random(iterations);
            while (addedRandoms.contains(randomVal)) randomVal = random(iterations); //Get another random instead of sequentially finding another unused value, to prevent clumping
            addedRandoms.add(randomVal);
            random[i] = random(iterations);
        }

        HashMap<Integer, Integer> hashMap = new HashMap<>();
        LinkedHashMap<Integer, Integer> linkedHashMap = new LinkedHashMap<>();


        int gcRuns = 0, prevGCRuns;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) gcRuns += gcBean.getCollectionCount();
        prevGCRuns = gcRuns;


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: sequential key (from 0 to " + (iterations - 1) + "), no overwrites", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: random key (between 0 and " + (iterations - 1) + " inclusive), all overwrites (exactly one per entry, random order)", time1, time2, time3, prevGCRuns);


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: sequential key (from 0 to " + (iterations - 1) + "), all overwrites (exactly one per entry, sequential order)", time1, time2, time3, prevGCRuns);


        //Empty maps
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: random key (between 0 and " + (iterations - 1) + " inclusive), no overwrites", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Sequential get, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Random get, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Set sequential keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : sequential)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Sequential get, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Random get, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.values()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.values()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("values() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Set random keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : random)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.values()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.values()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("values() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.keySet()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.keySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("keySet() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Set sequential keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : sequential)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.keySet()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.keySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("keySet() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : hashMap.entrySet()) ;
        time2 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : linkedHashMap.entrySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("entrySet() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Set random keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : random)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : hashMap.entrySet()) ;
        time2 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : linkedHashMap.entrySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("entrySet() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);
    }


    protected static int printAndReset(String description, long time1, long time2, long time3, int prevGCRuns)
    {
        System.out.println(description);
        System.out.println("HashMap: " + (time2 - time1) + " nanos");
        System.out.println("LinkedHashMap: " + (time3 - time2) + " nanos");
        int gcRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) gcRuns += gcBean.getCollectionCount();
        System.out.println("GC during test: " + (gcRuns != prevGCRuns));
        System.out.println();

        return gcRuns;
    }


    public static int random(int maxvalue)
    {
        return (int) ((double) maxvalue * Math.random());
    }
}

Output...

Put: sequential key (from 0 to 999999), no overwrites
HashMap: 30190070 nanos
LinkedHashMap: 44618672 nanos
GC during test: false

Put: random key (between 0 and 999999 inclusive), all overwrites (exactly one per entry, random order)
HashMap: 118536111 nanos
LinkedHashMap: 110828524 nanos
GC during test: false

Put: sequential key (from 0 to 999999), all overwrites (exactly one per entry, sequential order)
HashMap: 25070771 nanos
LinkedHashMap: 23569874 nanos
GC during test: false

Put: random key (between 0 and 999999 inclusive), no overwrites
HashMap: 93353708 nanos
LinkedHashMap: 106686445 nanos
GC during test: false

Sequential get, randomized internal keys
HashMap: 38817600 nanos
LinkedHashMap: 39499373 nanos
GC during test: false

Random get, randomized internal keys
HashMap: 42636179 nanos
LinkedHashMap: 51062653 nanos
GC during test: false

Sequential get, sequential internal keys
HashMap: 19986540 nanos
LinkedHashMap: 19502021 nanos
GC during test: false

Random get, sequential internal keys
HashMap: 58083205 nanos
LinkedHashMap: 59547574 nanos
GC during test: false

values() iteration, sequential internal keys
HashMap: 21284921 nanos
LinkedHashMap: 18383069 nanos
GC during test: false

values() iteration, randomized internal keys
HashMap: 19616868 nanos
LinkedHashMap: 15392964 nanos
GC during test: false

keySet() iteration, randomized internal keys
HashMap: 18054895 nanos
LinkedHashMap: 16067725 nanos
GC during test: false

keySet() iteration, sequential internal keys
HashMap: 18764430 nanos
LinkedHashMap: 18604873 nanos
GC during test: false

entrySet() iteration, sequential internal keys
HashMap: 18493825 nanos
LinkedHashMap: 18067752 nanos
GC during test: false

entrySet() iteration, randomized internal keys
HashMap: 16252707 nanos
LinkedHashMap: 13175517 nanos
GC during test: false

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM