
Hashtable Implementation

I was recently asked 'how would you implement a hashtable'. I know the hashing algorithm is critical, since the fewer collisions the better WRT performance, but what algorithm/data structure should be employed to deliver amortized constant time {O(1)} for insert/delete/lookups?

There are two main ways to implement a hash table:

  1. Open addressing, which uses a simple array [actually a dynamic array, if you let your table grow on the fly]. When a collision occurs, you use a second hash function to find the next entry the element can be mapped to. Note that this solution runs into some trouble [which can be solved] when your hash table also allows deletions [use a special mark for "deleted" entries].
  2. Chaining: in this solution, each entry in the array is a linked list containing all the elements hashed to that entry.
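To make the open addressing option and its "deleted" mark concrete, here is a minimal sketch in Python. It is only an illustration under some assumptions: the class name `OpenAddressingTable`, the sentinels `_EMPTY` and `_DELETED`, and the choice of a power-of-two capacity with an odd step derived from the same hash value (standing in for the second hash function) are all hypothetical. The table has a fixed capacity and does not grow.

```python
_EMPTY = object()    # slot that has never held an entry
_DELETED = object()  # special mark left behind when an entry is removed


class OpenAddressingTable:
    """Fixed-capacity open-addressing table using double hashing."""

    def __init__(self, capacity=8):
        # With a power-of-two capacity and an odd step, every probe
        # sequence visits all slots exactly once.
        assert capacity > 0 and capacity & (capacity - 1) == 0
        self._slots = [_EMPTY] * capacity

    def _probe(self, key):
        """Yield the slot indices to try for key, in order."""
        n = len(self._slots)
        h = hash(key)
        start = h % n
        step = ((h // n) % n) | 1  # stand-in second hash, forced odd
        for i in range(n):
            yield (start + i * step) % n

    def put(self, key, value):
        first_free = None
        for idx in self._probe(key):
            slot = self._slots[idx]
            if slot is _EMPTY:
                if first_free is None:
                    first_free = idx
                break  # key cannot appear later in the sequence
            if slot is _DELETED:
                if first_free is None:
                    first_free = idx  # reusable, but keep looking for key
            elif slot[0] == key:
                self._slots[idx] = (key, value)  # overwrite existing entry
                return
        if first_free is None:
            raise RuntimeError("table is full")
        self._slots[first_free] = (key, value)

    def get(self, key, default=None):
        for idx in self._probe(key):
            slot = self._slots[idx]
            if slot is _EMPTY:
                break  # reached a never-used slot: key is absent
            if slot is not _DELETED and slot[0] == key:
                return slot[1]
        return default

    def delete(self, key):
        for idx in self._probe(key):
            slot = self._slots[idx]
            if slot is _EMPTY:
                return False
            if slot is not _DELETED and slot[0] == key:
                self._slots[idx] = _DELETED  # keep probe chains intact
                return True
        return False
```

The reason for the mark: if `delete` reset the slot to `_EMPTY`, a later `get` for a key that had probed past that slot would stop too early and report the key as missing.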

The important part about hash tables [in both solutions], in order to allow amortized O(1) insert/delete/lookup, is allocating a bigger table and rehashing once a predefined load factor is reached.
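As a rough sketch of how growing and rehashing fit together, here is a small chaining table in Python. The names (`ChainedHashTable`, `max_load`, `_resize`), the default load factor of 0.75, and the doubling policy are assumptions made for illustration. Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Chaining hash table that doubles its array when the load factor is exceeded."""

    def __init__(self, capacity=8, max_load=0.75):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0
        self._max_load = max_load

    def _bucket(self, key):
        # One array access: pick the chain this key hashes to.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing entry
                return
        bucket.append((key, value))
        self._size += 1
        if self._size > self._max_load * len(self._buckets):
            self._resize(2 * len(self._buckets))

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self._size -= 1
                return True
        return False

    def _resize(self, new_capacity):
        # Rehash: every entry is re-inserted into the bigger array, O(n) work.
        old = self._buckets
        self._buckets = [[] for _ in range(new_capacity)]
        for bucket in old:
            for key, value in bucket:
                self._bucket(key).append((key, value))

    def __len__(self):
        return self._size
```

Doubling [rather than growing by a fixed amount] is what keeps the rehash cost amortized: each O(n) rehash is paid for by the roughly n cheap insertions that preceded it.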

EDIT: complexity analysis:
Assume a load factor of p for some p < 1.

  1. Assume the probability of a "collision" on each access is p. An operation then needs exactly i array accesses with probability p^(i-1) * (1-p), so the mean number of array accesses is: Sigma(i * p^(i-1) * (1-p)) for each i in [1,n]. This gives you: Sigma(i * p^(i-1) * (1-p)) = (1-p) * Sigma(i * p^(i-1)) <= (1-p) * 1/(1-p)^2 = 1/(1-p) = CONST [have a look at the correctness of Sigma(i * p^(i-1)) <= 1/(1-p)^2 in Wolfram Alpha]. Thus each operation makes, on average, a constant number of accesses to the array. Also: you might need to rehash all elements once the load factor is reached, which costs O(n) accesses to the array. This results in n * O(1) ops [adding n elements] and O(n) ops in total for rehashing [if the table doubles each time, the rehash costs form a geometric series that still sums to O(n)], so you get: (n * O(1) + O(n)) / n = O(1) amortized time.
  2. Very similar to (1), but the analysis is done on list accesses. Each operation requires exactly one array access and a variable number of list accesses, with the same analysis as before.
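A quick numeric check of both parts of the argument, as a sketch: the first function evaluates the sum Sigma(i * p^(i-1) * (1-p)) directly so it can be compared against 1/(1-p), and the second counts how many elements get moved by rehashing while n elements are inserted. The function names, the initial capacity of 8, the load factor of 0.75, and the doubling policy are assumptions chosen for illustration.

```python
def expected_probes(p, n):
    """Sum of i * p**(i-1) * (1-p) for i = 1..n."""
    return sum(i * p ** (i - 1) * (1 - p) for i in range(1, n + 1))


def total_rehash_moves(n, initial_capacity=8, max_load=0.75):
    """Elements moved by rehashing while inserting n elements into a doubling table."""
    capacity, moves = initial_capacity, 0
    for size in range(1, n + 1):
        if size > max_load * capacity:
            moves += size   # a rehash touches every element currently stored
            capacity *= 2
    return moves


# Mean accesses stay below 1/(1-p) regardless of how large n gets.
for p in (0.25, 0.5, 0.75, 0.9):
    print(p, expected_probes(p, 1000), 1 / (1 - p))

# Rehash work per inserted element stays bounded by a small constant.
n = 100_000
print(total_rehash_moves(n) / n)
```

Note that the constant grows as p approaches 1, which is why the load factor threshold is kept well below 1 for open addressing.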


 