
Using a map to check if an id exists

I'm implementing a locking mechanism, and for this I need a fast lookup of whether a given Id is already locked. I was thinking of using a map, and I'm wondering whether there is a better structure for this. I don't really need a map, because no mapping is done. However, if I used a vector, I would have to do a linear search, which would become expensive with many entries.

Is there a structure that gives me a similarly fast lookup without the overhead of storing extra data?

For example:

std::map<IdType, bool> locked;

// Prevent deadlock by checking if this thread already locked. Otherwise
// it can pass through.
if(locked.find(Id) != locked.end())
    lock();

As you can see, I don't really need the mapped value. I know that std::vector<bool> is compressed to bits. Now I wonder whether I waste a lot of memory maintaining those bools when I don't need them anyway. Would a char be better, or is there some other structure that gives me just the key lookup without the extra data?

If you have C++0x, you could use std::unordered_set; its average lookup is O(1).

From the cppreference.com documentation (emphasis mine):

... Search, insertion, and removal have average constant-time complexity.

Internally, the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its value. This allows fast access to individual elements, since once a hash is computed, it refers to the exact bucket the element is placed into.
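As a rough sketch of how the lookup from the question could look with a set instead of a map: the LockedIds class, its member names, and the choice of int for IdType are assumptions made for illustration, not part of the original code. The sketch only tracks ids; it is not thread-safe on its own and would still need whatever synchronization the real locking mechanism uses.

```cpp
#include <cassert>
#include <unordered_set>

// Assumption: the question does not say what IdType is, so int is used here.
using IdType = int;

// Hypothetical helper that records which ids are currently locked.
// Only the keys are stored; there is no mapped value.
class LockedIds {
public:
    // Returns true if the id was newly marked as locked,
    // false if it was already present.
    bool tryLock(IdType id) { return ids_.insert(id).second; }

    // Removes the id; does nothing if it is not present.
    void unlock(IdType id) { ids_.erase(id); }

    // Membership test, constant time on average.
    bool isLocked(IdType id) const { return ids_.count(id) != 0; }

private:
    std::unordered_set<IdType> ids_;
};

// Small usage example.
inline void lockedIdsExample() {
    LockedIds locked;
    assert(locked.tryLock(42));   // first attempt marks the id as locked
    assert(!locked.tryLock(42));  // second attempt sees it is already locked
    locked.unlock(42);
    assert(!locked.isLocked(42));
}
```

Using insert(...).second folds the "is it already there?" check and the insertion into one lookup, which avoids a separate find() call.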

If you don't have C++0x, unordered_set should be available in TR1:

#include <tr1/unordered_set>
std::tr1::unordered_set<IdType> locked;

You could also use an unordered_map, but I guess the readers of your code would have a hard time understanding what the mapped value is used for.


PS: And keep the Rules Of Optimization in mind ;)

You can use std::vector<bool> or boost::dynamic_bitset under the following conditions:

  1. IdType is an integral type

  2. All id values fit inside a short enough range. The memory usage will be (length of that range)/8 bytes, which can be a couple of orders of magnitude less than would be consumed by a std::unordered_set<int> or std::set<int> containing all elements from that range.

  3. You don't have to iterate over elements of your set (just insert/remove/check presence), or iteration occurs infrequently and the performance emphasis is on insertion/removal/containment-testing operations.

In such situations a dynamic bitset is a more suitable data structure (both faster and more memory-efficient).
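A minimal sketch of the bitset approach under the conditions above, using std::vector<bool>. The IdBitset class and its member names are hypothetical, and it assumes ids are non-negative integers below a capacity known up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical presence set for ids in the range [0, capacity).
// std::vector<bool> typically packs its elements into bits, so memory
// use is roughly capacity / 8 bytes.
class IdBitset {
public:
    explicit IdBitset(std::size_t capacity) : bits_(capacity, false) {}

    // Marks the id as present. The id must be inside the range.
    void insert(std::size_t id) {
        assert(id < bits_.size());
        bits_[id] = true;
    }

    // Marks the id as absent. The id must be inside the range.
    void erase(std::size_t id) {
        assert(id < bits_.size());
        bits_[id] = false;
    }

    // Ids outside the range are reported as absent.
    bool contains(std::size_t id) const {
        return id < bits_.size() && bits_[id];
    }

private:
    std::vector<bool> bits_;
};
```

Insertion, removal and lookup are each a single indexed bit access, with no hashing or tree traversal. The trade-off is that the range has to be fixed and reasonably small.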

