
Why is my std::unordered_map access time not constant?

I wrote some code to test the performance of my unordered map with a 2-component vector as the key.

std::unordered_map<Vector2i, int> m;                                                                      

for(int i = 0; i < 1000; ++i)                                                                             
    for(int j = 0; j < 1000; ++j)                                                                         
        m[Vector2i(i,j)] = i*j+27*j;                                                                      

clock.restart();                                                                                          

auto found = m.find(Vector2i(0,5));                                                                                                                                                            

std::cout << clock.getElapsedTime().asMicroseconds() << std::endl;                                         

Output for the code above: 56 (microseconds). When I replace 1000 in the for loops with 100, the output is 2 (microseconds). Isn't the lookup time supposed to be constant?

hash function for my Vector2i:

namespace std
{
    template<>
    struct hash<Vector2i>
    {
        std::size_t operator()(const Vector2i& k) const
        {
            // XOR the hash of x with the hash of y shifted left by one bit.
            return (hash<int>()(k.x)) ^ (hash<int>()(k.y) << 1);
        }
    };
}

EDIT: I added this code to count the collisions after the for loop:

for (size_t bucket = 0; bucket != m.bucket_count(); ++bucket)                                             
    if (m.bucket_size(bucket) > 1)                                                                        
         ++collisions; 

With 100*100 elements: collisions = 256

1000*1000 elements: collisions = 2048

A hash table guarantees constant lookup time only on average, not in the worst case. If the hash table is well balanced (i.e., the hash function is good), the elements are spread evenly across the buckets. If the hash function is poor, however, you may get many collisions, and a lookup then has to traverse the chain of elements that landed in the same bucket (usually a linked list), which takes time proportional to the chain length. So first check that the load factor and the hash function are OK in your case. Lastly, make sure you compile your code in release mode with optimizations turned on (e.g., -O3 for g++/clang++).
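As a rough check of the hash function from the question, the sketch below counts how many distinct values its combining formula can produce for the keys (i, j) with 0 <= i, j < n. It assumes that std::hash<int> returns its argument unchanged, which common implementations such as libstdc++ do but the standard does not require. The names xor_shift_hash and distinct_hashes are made up for this illustration.

```cpp
#include <cstddef>
#include <unordered_set>

// The question's combining formula, with std::hash<int> replaced by the
// identity function (an assumption; the standard does not require it).
inline std::size_t xor_shift_hash(int x, int y) {
    return static_cast<std::size_t>(x) ^ (static_cast<std::size_t>(y) << 1);
}

// Counts the distinct hash values produced by the n*n keys (i, j)
// with 0 <= i < n and 0 <= j < n.
inline std::size_t distinct_hashes(int n) {
    std::unordered_set<std::size_t> seen;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            seen.insert(xor_shift_hash(i, j));
    return seen.size();
}
```

Under that assumption, distinct_hashes(100) is 256 and distinct_hashes(1000) is 2048, which lines up with the collision counts reported in the question. With 1000*1000 keys sharing only 2048 hash values, each value is shared by roughly 488 keys on average, so a lookup may have to walk a chain of several hundred elements.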

This question may also be useful: How to create a good hash_combine with 64 bit output (inspired by boost::hash_combine).
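A minimal sketch of that idea for a two-component key might look like the following. It is written in the style of the classic boost::hash_combine formula and is not taken from the linked question. Vec2i is a hypothetical stand-in for the question's Vector2i so that the code compiles on its own, and Vec2iHash and longest_bucket are illustrative names.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical stand-in for the question's Vector2i.
struct Vec2i {
    int x;
    int y;
    bool operator==(const Vec2i& o) const { return x == o.x && y == o.y; }
};

// Mixes a new value into a running seed, in the style of the classic
// boost::hash_combine formula.
inline void hash_combine(std::size_t& seed, std::size_t value) {
    seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hasher passed as the third template argument of std::unordered_map.
struct Vec2iHash {
    std::size_t operator()(const Vec2i& k) const {
        std::size_t seed = 0;
        hash_combine(seed, std::hash<int>()(k.x));
        hash_combine(seed, std::hash<int>()(k.y));
        return seed;
    }
};

// Returns the size of the fullest bucket, i.e. the longest chain
// a single lookup might have to walk.
template <class Map>
std::size_t longest_bucket(const Map& m) {
    std::size_t longest = 0;
    for (std::size_t b = 0; b != m.bucket_count(); ++b)
        if (m.bucket_size(b) > longest) longest = m.bucket_size(b);
    return longest;
}
```

Declaring the map as std::unordered_map<Vec2i, int, Vec2iHash> and comparing longest_bucket(m) and m.load_factor() before and after switching hash functions gives a more direct picture of lookup cost than counting buckets that hold more than one element.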
