
Can two keys with different hashCodes be part of the same bucket in a Java HashMap?

I have a HashMap with 16 buckets (the default). Is it possible for two keys with different hashCodes to end up in the same bucket? Or is a new bucket always created for each different hashCode, with the HashMap growing its number of buckets that way?

Read many posts, but only confused myself.

Yes, it is possible. The number of buckets is proportional to the number of entries in the HashMap, while the number of possible hashCodes is the number of possible int values, which is far larger. The final mapping of a hashCode to a bucket is therefore done with a modulus-like operation, so multiple hashCodes may be mapped to the same bucket. For example, if you have 16 buckets, the hashCodes 1 and 17 will both be mapped to the same bucket. (Note that by hashCode I don't mean the raw value returned by the hashCode method, since HashMap applies an additional function to that value in order to improve the distribution of the hash codes.)

That's why hashCode alone is not enough to determine if the key we are looking for is present in the map - we have to use equals as well.
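To make this concrete, here is a minimal sketch. The helper bucketIndex is a hypothetical stand-in for the last step of the mapping (it is not a HashMap API), and it assumes the table length is a power of two. The second half relies on the fact that the strings "Aa" and "BB" have the same hashCode, so they necessarily share a bucket and only equals can tell them apart.

```java
import java.util.HashMap;
import java.util.Map;

class SameBucketDemo {
    // Hypothetical helper: maps an already-processed hash to a bucket index,
    // assuming the table length is a power of two.
    static int bucketIndex(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        // 1 and 17 are different values but share their low four bits,
        // so with 16 buckets they get the same index.
        System.out.println(bucketIndex(1, 16));
        System.out.println(bucketIndex(17, 16));

        // "Aa" and "BB" have equal hashCodes, so they land in the same bucket.
        System.out.println("Aa".hashCode() == "BB".hashCode());

        // HashMap still keeps them apart, because lookups also call equals().
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        System.out.println(map.get("Aa"));
        System.out.println(map.get("BB"));
    }
}
```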

Taken from How HashMap works in Java:

Since the internal array of HashMap is of fixed size, and if you keep storing objects, at some point of time hash function will return same bucket location for two different keys, this is called collision in HashMap. In this case, a linked list is formed at that bucket location and a new entry is stored as next node.

And then, when we want to get that object back from the list, we need equals():

If we try to retrieve an object from this linked list, we need an extra check to search correct value, this is done by equals() method. Since each node contains an entry, HashMap keeps comparing entry's key object with the passed key using equals() and when it return true, Map returns the corresponding value.
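The chaining described in these quotes can be sketched as a toy map. This is not the real HashMap implementation: ToyChainedMap, its Node class and its fixed 16-slot table are all made up for illustration, there is no resizing and no supplemental hash, and new nodes are prepended to the chain for brevity.

```java
import java.util.Objects;

class ToyChainedMap<K, V> {
    // One entry in a bucket's singly linked list.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed-size table of 16 buckets (assumption: never resized in this sketch).
    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexOf(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h & (table.length - 1);
    }

    void put(K key, V value) {
        int i = indexOf(key);
        // Walk the chain; equals() decides whether the key is already present.
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                n.value = value;
                return;
            }
        }
        // Collision or empty bucket: link a new node into the chain.
        table[i] = new Node<>(key, value, table[i]);
    }

    V get(Object key) {
        // The bucket index narrows the search; equals() finds the exact entry.
        for (Node<K, V> n = table[indexOf(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyChainedMap<Integer, String> map = new ToyChainedMap<>();
        map.put(1, "one");
        map.put(17, "seventeen"); // different hashCode, same bucket as 1
        System.out.println(map.get(1));
        System.out.println(map.get(17));
        System.out.println(map.get(33)); // same bucket, but no equal key
    }
}
```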

hashCode() returns an int in Java, so the whole int range has to be mapped onto the number of buckets. Whenever you map from a bigger set to a smaller set, collisions are unavoidable. If you look at the HashMap source code of older JDK versions (Java 7 and earlier), you will find the following method, which maps an int to an index within the table length:

static int indexFor(int h, int length) {
    return h & (length - 1);
}
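As a rough illustration of why a bit mask can play the role of a modulus here: when length is a power of two, h & (length - 1) gives the same result as Math.floorMod(h, length), including for negative values of h. The class name and the sample values below are made up for the demonstration.

```java
class IndexForDemo {
    // Same body as the indexFor method quoted above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two for the mask trick to hold
        int[] samples = {0, 1, 17, 31, 2112, -1, Integer.MIN_VALUE};
        for (int h : samples) {
            // Print the masked index next to the non-negative remainder.
            System.out.println(h + " -> " + indexFor(h, length)
                    + " (floorMod: " + Math.floorMod(h, length) + ")");
        }
    }
}
```

The plain % operator would not be a safe replacement, since it returns a negative result for negative h.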

The hash code is preprocessed to produce uniform distribution using:

static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

Applies a supplemental hash function to a given hashCode, which defends against poor quality hash functions. This is critical because HashMap uses power-of-two length hash tables, that otherwise encounter collisions for hashCodes that do not differ in lower bits. Note: Null keys always map to hash 0, thus index 0.
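A small sketch of what "do not differ in lower bits" means in practice. The two sample values 0x10000 and 0x20000 are chosen for illustration only; hash and indexFor are copied from the snippets above into a made-up demo class.

```java
class SupplementalHashDemo {
    // Copied from the supplemental hash function quoted above.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Copied from the indexFor method quoted above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000;
        int b = 0x20000;
        int length = 16;

        // Without the supplemental hash, only the low four bits matter,
        // and they are zero for both values: same bucket.
        System.out.println(indexFor(a, length) + " " + indexFor(b, length));

        // With the supplemental hash, the higher bits are folded into the
        // low bits, so these two values end up in different buckets.
        System.out.println(indexFor(hash(a), length) + " " + indexFor(hash(b), length));
    }
}
```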

HashMap source
