
What is the relation between hashcode and hash-based collections (i.e. HashMap and HashSet)?

I have read here:

An object's hash code allows algorithms and data structures to put objects into compartments, just like letter types in a printer's type case. The printer puts all “A” types into the compartment for “A”, and he looks for an “A” only in this one compartment. This simple system lets him find types much faster than searching in an unsorted drawer. That's also the idea of hash-based collections, such as HashMap and HashSet.

But I am not getting it. What idea is being referred to here? How is a HashMap or HashSet related to the hash code?


EDIT:

Take a HashSet for an example.

Say it stores 7 objects (let's represent them by letters), each of which could be anything from A to E. So say it has A at index 0, B at index 1, C at index 2, A at index 3, A again at index 4, D at index 5, and E at index 6. This is one form of storage.

hashSet => [A, B, C, A, A, D, E]

Then there is another form of storage, in which all three A's go in one compartment, then B in another, C in another, and D and E in two more compartments.

hashCode-based classification => [A, A, A], [B], [C], [D], [E]

Now if I check hashSet.contains('A'), it will directly look into the compartment containing A, which is the first compartment, and compute A.equals(member-of-compartment) for each member of the compartment?

Am I right?

A HashMap or HashSet is like the drawer with compartments. These data structures use the hashCode to put objects in the right compartment.

A very simple hashCode for Strings (not the one used by Java, but the one used a lot in real life) is the first letter of the String. That's what a lot of people use to put things in the right compartment.

So both a HashMap and a HashSet first look up the right compartment to search for the object, based on the hashCode. Once they have the right compartment, they look through it from front to back to find the right object (but ideally, there are only a few objects in each compartment).
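To make the two steps concrete, here is a minimal sketch of a "first letter" drawer. It is a toy under stated assumptions, not how java.util.HashSet is written: the class name FirstLetterBuckets, the fixed 26 compartments, and the compartmentOf helper are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: 26 compartments, one per first letter.
// java.util.HashSet uses a different hash function and bucket layout.
class FirstLetterBuckets {
    private final List<List<String>> buckets = new ArrayList<>();

    FirstLetterBuckets() {
        for (int i = 0; i < 26; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The "hash code" of this toy: which compartment a word belongs to.
    // Assumes a non-empty word; anything not starting with a-z lands in compartment 0.
    static int compartmentOf(String word) {
        char first = Character.toLowerCase(word.charAt(0));
        return (first >= 'a' && first <= 'z') ? first - 'a' : 0;
    }

    void add(String word) {
        List<String> bucket = buckets.get(compartmentOf(word));
        if (!bucket.contains(word)) { // a set keeps equal elements only once
            bucket.add(word);
        }
    }

    boolean contains(String word) {
        // Step 1: jump straight to one compartment.
        // Step 2: scan only that compartment, comparing with equals().
        for (String candidate : buckets.get(compartmentOf(word))) {
            if (candidate.equals(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        FirstLetterBuckets set = new FirstLetterBuckets();
        set.add("apple");
        set.add("avocado");
        set.add("banana");
        System.out.println(set.contains("avocado")); // true
        System.out.println(set.contains("apricot")); // false: right compartment, no equal word
    }
}
```

Looking up "apricot" only ever touches the "a" compartment; "banana" is never compared at all, which is where the speed-up comes from.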

EDIT:

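You're right about HashSet in your edit, with one caveat: a set keeps equal elements only once, so the three A's would be stored as a single A. (If you implemented the hashCode function to always return the same value, say zero, the set would still work, but it would put everything into one bucket/compartment, much like "hashSet => [A, B, C, A, A, D, E]", and every lookup would have to search through that one bucket.)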
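A quick way to convince yourself of the constant-hashCode case is to try it. The sketch below uses a hypothetical Letter class (not from the question) whose hashCode always returns 0; the set still gives correct answers, it just cannot narrow the search down to a small bucket.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashDemo {
    // Hypothetical class for illustration: every instance reports the same hash code.
    static final class Letter {
        final char value;

        Letter(char value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Letter && ((Letter) o).value == value;
        }

        @Override
        public int hashCode() {
            return 0; // legal, but every Letter ends up in the same bucket
        }
    }

    // Adds the seven letters from the question: A, B, C, A, A, D, E.
    static Set<Letter> build() {
        Set<Letter> set = new HashSet<>();
        for (char c : "ABCAADE".toCharArray()) {
            set.add(new Letter(c));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<Letter> set = build();
        System.out.println(set.size());                    // 5: the repeated A's are stored once
        System.out.println(set.contains(new Letter('D'))); // true, found via equals()
        System.out.println(set.contains(new Letter('Z'))); // false
    }
}
```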

Note that there is very little difference between a HashMap and a HashSet in how they're implemented. A HashSet is just a HashMap that only uses keys and uses a single pre-defined value to indicate that the key is present in the map.
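As a rough sketch of that relationship, a set can be written as a thin wrapper around a HashMap. TinyHashSet and its PRESENT marker are illustrative names chosen here; this shows the idea, not the JDK's source.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative wrapper: the elements of the set are the keys of a map,
// and every key maps to the same shared marker object.
class TinyHashSet<E> {
    private static final Object PRESENT = new Object(); // the single pre-defined value

    private final Map<E, Object> map = new HashMap<>();

    // Returns true if the element was not already in the set.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true if the element was in the set and has been removed.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>();
        System.out.println(set.add("A"));      // true
        System.out.println(set.add("A"));      // false: the key already exists
        System.out.println(set.contains("A")); // true
        System.out.println(set.size());        // 1
    }
}
```

All the hashing work (finding the bucket, comparing with equals) is done by the map on its keys; the wrapper adds nothing of its own.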

Hash-based collections have what are normally referred to as buckets. All objects with the same hash code go into the same bucket. This is the same idea as the printer's type case example. A hash-based collection's bucket is like the printer's type case compartment.

All "A"s go into the same compartment just like all objects with the same hash code go into the same bucket.

The printer decides which compartment to use based only on which letter the piece of type carries, regardless of anything else about it. The "hash code" operation here is reading off which letter the type is.

Similarly, hash-based collections need to find which bucket an object belongs to. They do this by getting the object's hash code.

Just like a printer just needs to look in the "A" compartment to look for the specific "A" needed, a hash-based collection calls the hashCode() method to determine which bucket to look in to find an object.
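One simple way to turn a hash code into a bucket number is to take it modulo the number of buckets. The sketch below assumes that scheme; the name bucketIndex and the choice of 16 buckets are made up for illustration, and the real HashMap computes its index differently (it mixes the hash bits and uses a power-of-two table).

```java
class BucketIndexDemo {
    // Assumed scheme for illustration: hash code modulo bucket count.
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int bucketIndex(Object key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        int buckets = 16;
        System.out.println(bucketIndex("A", buckets)); // "A".hashCode() is 65 -> bucket 1
        System.out.println(bucketIndex("B", buckets)); // "B".hashCode() is 66 -> bucket 2
        System.out.println(bucketIndex(-7, buckets));  // Integer -7 hashes to -7 -> bucket 9
    }
}
```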

The printer still needs to visually inspect each object in the compartment for what he/she needs. Likewise, a hash-based collection still needs to find the correct object, even if multiple objects are in the same bucket. The hash-based collection calls equals() on the entries in that bucket to see whether one of them matches the key it is looking for.
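A small experiment shows why the equals step is needed. The strings "Aa" and "BB" happen to have the same hashCode in Java (both 2112), so they belong to the same bucket, yet the set does not confuse them. The class name CollisionDemo and the build helper are just for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // Builds a set that contains "Aa" but not "BB".
    static Set<String> build() {
        Set<String> set = new HashSet<>();
        set.add("Aa");
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = build();
        // Same hash code, so both strings point at the same bucket.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        // equals() inside the bucket tells them apart.
        System.out.println(set.contains("Aa")); // true
        System.out.println(set.contains("BB")); // false
    }
}
```

The hash code only narrows the search; equals makes the final decision. That is also why two objects that are equal must return the same hash code: otherwise the lookup would search the wrong bucket and never reach the equals call.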
