
What are the limitations on hash functions used with std::unordered_map?

I am using std::unordered_map to represent data in a 3-dimensional spatial arrangement. My hash function is:

unsigned int x, y, z;
unsigned int a = 1000;
unsigned int b = 1000 * a;
unsigned int Hash = x + a * y + b * z;

This should allow for up to 1000 units of x and 1000 units of y before there are any collisions. My question is: are there any limits to the collision-free space of my hash function? Or can I set a and b to be arbitrarily large numbers, noting that this would easily exceed the memory of my system if it were all allocated?

cheers

First, some background on the operation of hashtables:

Hashtables don't allocate enough buckets to hold the entire space of the hash function. That would indeed be wasteful (and probably also impossible). They allocate a certain number of buckets (say 16) and then store each pair in the bucket that the key hashes to modulo the number of buckets.

The number of buckets is increased when the map's load factor (the ratio of stored elements to buckets) reaches a certain threshold, often around 0.75-0.85; std::unordered_map's default max_load_factor is 1.0. This forces a rehash of all of the keys so they can be redistributed under the new modulus.

So if your hash function returns 50 for a particular key, and the hashtable has 16 buckets, the pair for that key is stored in bucket (50 mod 16) = 2.

If the number of buckets is later increased to 32, the pair gets moved to the bucket (50 mod 32) = 18.
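
To make the bucket arithmetic concrete, here is a short, self-contained illustration. Exact bucket counts are implementation-defined, and this assumes the common case where std::hash of a small integer is the integer itself:

#include <cstdio>
#include <unordered_map>

int main() {
    std::unordered_map<unsigned int, int> m;
    m[50] = 1;  // std::hash<unsigned int>{}(50) is commonly just 50

    // Which bucket did the key land in, out of how many?
    std::printf("buckets: %zu, bucket of key 50: %zu, load factor: %.2f\n",
                m.bucket_count(), m.bucket(50u), m.load_factor());

    // Force a larger bucket count: every stored key is rehashed and
    // may end up in a different bucket under the new modulus.
    m.rehash(64);
    std::printf("buckets: %zu, bucket of key 50: %zu\n",
                m.bucket_count(), m.bucket(50u));
}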

can I set a and b to be large numbers

Absolutely, because only the hash modulo the number of allocated buckets is used to find the bucket for a particular key.
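
For completeness, here is a minimal sketch of plugging such a hash into std::unordered_map. The Cell key type, its member names, and the multiplier values are illustrative assumptions, not code from the question; the result is widened to std::size_t, which is the return type the map expects from its hasher and which avoids 32-bit wrap-around when a and b grow large:

#include <cstddef>
#include <unordered_map>

// Hypothetical 3D grid key -- the name and members are illustrative only.
struct Cell {
    unsigned int x, y, z;
    bool operator==(const Cell& o) const { return x == o.x && y == o.y && z == o.z; }
};

// Sketch of the question's hash with wider multipliers. std::size_t is the
// result type unordered_map expects, and on 64-bit platforms it also avoids
// the overflow a 32-bit unsigned int would suffer for large a and b.
struct CellHash {
    std::size_t operator()(const Cell& c) const {
        const std::size_t a = 1000000;   // assumed per-axis extent, purely illustrative
        const std::size_t b = a * a;
        return c.x + a * c.y + b * c.z;  // the table reduces this modulo its bucket count
    }
};

int main() {
    std::unordered_map<Cell, float, CellHash> grid;
    grid[{1, 2, 3}] = 42.0f;             // only inserted cells consume memory
}

Only the cells that are actually inserted consume memory, so a very large collision-free key space costs nothing by itself.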
