
What are the limitations on hash functions used with std::unordered_map?

I am using std::unordered_map to represent data in a 3-dimensional spatial arrangement. My hash function is:

unsigned int x, y, z;
unsigned int a = 1000;
unsigned int b = 1000 * a;
unsigned int Hash = x + a * y + b * z;

This should allow for up to 1000 units of x and 1000 units of y before there are any collisions. My question is: are there any limits to the collision-free space of my hash function? Or can I set a and b to be arbitrarily large numbers, noting that this would easily exceed the memory of my system if it were all allocated?

cheers

First, some background on the operation of hashtables:

Hashtables don't allocate enough buckets to hold the entire space of the hash function. That would indeed be wasteful (and probably also impossible). They allocate a certain number of buckets (say 16) and then store each pair in the bucket that the key hashes to modulo the number of buckets.

The number of buckets is increased when the map's load factor (the ratio of stored elements to buckets) reaches a certain threshold, often around 0.75-0.85; std::unordered_map's default max_load_factor is 1.0. This forces a rehash of all of the keys so they can be redistributed under the new modulus.

So if your hash function returns 50 for a particular key, and the hashtable has 16 buckets, the pair for that key is stored in bucket (50 mod 16) = 2.

If the number of buckets is later increased to 32, the pair gets moved to the bucket (50 mod 32) = 18.
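
To make the bucket arithmetic concrete, here is a short, self-contained illustration. Exact bucket counts are implementation-defined, and this assumes the common case where std::hash of a small integer is the integer itself:

#include <cstdio>
#include <unordered_map>

int main() {
    std::unordered_map<unsigned int, int> m;
    m[50] = 1;  // std::hash<unsigned int>{}(50) is commonly just 50

    // Which bucket did the key land in, out of how many?
    std::printf("buckets: %zu, bucket of key 50: %zu, load factor: %.2f\n",
                m.bucket_count(), m.bucket(50u), m.load_factor());

    // Force a larger bucket count: every stored key is rehashed and
    // may end up in a different bucket under the new modulus.
    m.rehash(64);
    std::printf("buckets: %zu, bucket of key 50: %zu\n",
                m.bucket_count(), m.bucket(50u));
}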

can I set a and b to be large numbers

Absolutely, because only the hash modulo the number of allocated buckets is used to find the bucket for a particular key.
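
For completeness, here is a minimal sketch of plugging such a hash into std::unordered_map. The Cell key type, its member names, and the multiplier values are illustrative assumptions, not code from the question; the result is widened to std::size_t, which is the return type the map expects from its hasher and which avoids 32-bit wrap-around when a and b grow large:

#include <cstddef>
#include <unordered_map>

// Hypothetical 3D grid key -- the name and members are illustrative only.
struct Cell {
    unsigned int x, y, z;
    bool operator==(const Cell& o) const { return x == o.x && y == o.y && z == o.z; }
};

// Sketch of the question's hash with wider multipliers. std::size_t is the
// result type unordered_map expects, and on 64-bit platforms it also avoids
// the overflow a 32-bit unsigned int would suffer for large a and b.
struct CellHash {
    std::size_t operator()(const Cell& c) const {
        const std::size_t a = 1000000;   // assumed per-axis extent, purely illustrative
        const std::size_t b = a * a;
        return c.x + a * c.y + b * c.z;  // the table reduces this modulo its bucket count
    }
};

int main() {
    std::unordered_map<Cell, float, CellHash> grid;
    grid[{1, 2, 3}] = 42.0f;             // only inserted cells consume memory
}

Only the cells that are actually inserted consume memory, so a very large collision-free key space costs nothing by itself.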
