
HashMap || How hashcode lookup is constant

In a HashMap, how is the hash code lookup constant time, O(1)?

We know that internally a HashMap creates an array whose slots are addressed by the hash code of each key, and that it uses a hash function to generate that hash code. We also know that a lookup takes constant time (assuming there are no collisions). Whenever we ask a HashMap for the value of a given key, it first calculates the bucket location (i.e. the index in the array that the key's hash code maps to). Then it fetches the value. I understand that the second part takes constant time. But what about the first part? How is finding the array index for a hash code constant time, especially when the HashMap holds millions of values?

My StackOverflow search found multiple questions on HashMap, but they mostly answered the second part of my question, not the first.

A few of the links I found:

  1. Why hashmap lookup is O(1) ie constant time?
  2. Can hash tables really be O(1)

I also found this question posted by a user at javarevisited.blogspot:

Hi Javin, Need a clarification on one of my recent interview question. For search and sort which Collection datastructure to prefer : ArrayList or LinkedList. I mentioned ArrayList would be the choice for retrieval operations as it implements Random Access whereas Linked list would be a better choice for insertion / deletion as it holds pointers for before and after node. My followup question, so do you mean to say retrieval is faster using an Arraylist which holds 1million records ? I said if index is known we can use the contains() and get the value. But clarify me on this 1million scenario in real dynamic case ie without knowing index. Would ArrayList be still faster ?

You seem to have a misunderstanding about data structures. When you create an array, a contiguous block of memory is reserved for it. The size of that block is the number of elements in the array multiplied by the size of each element.

Therefore, an array holding eight 2-byte numbers would be 16 bytes.

Let's say we want the fourth number, which sits at index 3 because arrays are zero-indexed. We can look it up without iteration because we know two things about the data structure: where it starts and the size of each element. Multiplying 2 bytes by 3 gives 6, so the element we want starts 6 bytes past the beginning of the array.
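To make the arithmetic concrete, here is a minimal sketch in Java. Java does not expose raw addresses, so a `ByteBuffer` stands in for the block of memory; the class name `ArrayOffsetDemo`, its helper methods, and the stored values 10 to 80 are made up for illustration.

```java
import java.nio.ByteBuffer;

class ArrayOffsetDemo {
    static final int ELEMENT_SIZE = Short.BYTES; // 2 bytes per element

    // Byte offset of the element at a zero-based index: index * element size.
    static int offsetOf(int index) {
        return index * ELEMENT_SIZE;
    }

    // Fill a 16-byte buffer with eight 2-byte numbers: 10, 20, ..., 80.
    static ByteBuffer buildArray() {
        ByteBuffer memory = ByteBuffer.allocate(8 * ELEMENT_SIZE);
        for (int i = 0; i < 8; i++) {
            memory.putShort(offsetOf(i), (short) ((i + 1) * 10));
        }
        return memory;
    }

    public static void main(String[] args) {
        ByteBuffer memory = buildArray();
        int offset = offsetOf(3);                    // fourth element, index 3
        System.out.println(offset);                  // 6
        System.out.println(memory.getShort(offset)); // 40
    }
}
```

The read is one multiplication and one memory access, and neither depends on how many elements the array holds.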

Hashmaps are usually backed by arrays of this nature. The calculation of where the desired array element starts is more complicated, but it can still be done without iteration. Therefore it is O(1). What is stored in that array location is the memory location (a reference) of the value being retrieved.

  • The function calculating the location in the backing array (where the memory location of the value is stored), using the key provided, is a constant time operation.
  • Reading the memory location calculated in the first step is a constant time operation.
  • Reading the memory location stored in the location read in the second step is a constant time operation.

Thus, the entire operation happens in constant time.
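The three steps above can be sketched roughly as follows. This is a toy illustration, not how `java.util.HashMap` is actually written: `TinyHashMap`, `Entry`, and `indexFor` are hypothetical names, the table size is fixed at 16, and collisions are deliberately ignored (a second key landing in the same bucket simply overwrites the first), which matches the "no collision" assumption in the question.

```java
class TinyHashMap {
    // A key together with the value stored for it.
    static final class Entry {
        final Object key;
        final Object value;
        Entry(Object key, Object value) { this.key = key; this.value = value; }
    }

    // Backing array: each slot holds a reference to an Entry, or null.
    private final Entry[] table = new Entry[16];

    // Step 1: compute the slot index from the key alone, no searching.
    static int indexFor(Object key, int length) {
        return Math.floorMod(key.hashCode(), length);
    }

    void put(Object key, Object value) {
        table[indexFor(key, table.length)] = new Entry(key, value);
    }

    Object get(Object key) {
        int index = indexFor(key, table.length); // step 1: compute the location
        Entry entry = table[index];              // step 2: read the reference in that slot
        if (entry == null || !entry.key.equals(key)) {
            return null;                         // empty slot, or a different key lives here
        }
        return entry.value;                      // step 3: follow the reference to the value
    }
}
```

For example, after `map.put(3, "three")`, the call `map.get(3)` performs the same three steps whether the table holds one entry or is full; none of them loops over the stored entries.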

Array lookup at a given index is done in constant time. Actually, it is a simple address computation (base + index * stride) followed by indirection.

Once you have the hash code, you can find the cell index in constant time too:

cellIndex = hash(X) % array.length

So the whole lookup takes constant time.
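One Java-specific caveat when applying that formula: `hashCode()` can return a negative `int`, and `%` keeps the sign of the left operand, so a plain `%` can produce a negative index. A small sketch, where `BucketIndexDemo`, `cellIndex`, and `maskIndex` are names invented for this example:

```java
class BucketIndexDemo {
    // cellIndex = hash(X) % array.length, made safe for negative hash codes.
    static int cellIndex(int hash, int length) {
        return Math.floorMod(hash, length);
    }

    // Alternative that is valid only when length is a power of two:
    // keep the low bits of the hash.
    static int maskIndex(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(-7 % 16);            // -7, not a valid array index
        System.out.println(cellIndex(-7, 16));  // 9
        System.out.println(maskIndex(-7, 16));  // 9
        // The cost is the same for any key and any number of stored entries.
        System.out.println(cellIndex("some key".hashCode(), 16));
    }
}
```

Either way the index comes from a fixed number of arithmetic operations on the hash code, so the size of the map never enters the computation.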


 