简体   繁体   中英

std::map with std::vector as key -- complexity of lookup function

I have a set of N customers, indexed 0,...,N-1 . Periodically, for some subset S of customers, I need to evaluate a function f(S) . Computing f(S) is of linear complexity in |S|. The set S of customers is represented as an object of type std::vector<int> . The subsets that come up for evaluation can be of different size each time. [Since the order of customers in S does not matter, the set can as well be represented as an object of type std::set<int> or std::unordered_set<int> .]

In the underlying application, I may have the same subset S of customers come up multiple times for evaluation of f(S) . Instead of incurring the needless linear complexity each time, I am looking to see if it would benefit from some sort of less computational burdensome lookup.

I am considering having a map of key-value pairs where the key is directly the vector of customers, std::vector<int> S and the value mapped to this key is f(S) . That way, I am hoping that I can first check to see if a key already exists in the map, and if it does, I can look it up without having to compute f(.) again.

Having an std::map with std::vector as keys is well-defined. See, for eg, here .

CPPReference indicates that map lookup time is logarithmic. But I suppose this is logarithmic in the number of key s where each key if of a constant length -- such as an int or a double , etc. How is the complexity affected where the key itself need not be of constant length and can be of arbitrary length upto size N ?

Since the keys can themselves be of different sizes (subset of customers that come up for evaluation could be different each time), does this introduce any additional complexity in computing a hash function or the compare operation for the std::map ? Is there any benefit to maintain the key as a binary array a fixed length N ? This binary array is such that B_S[i]=1 if the i th customer is in set S , and it is 0 otherwise. Does this make the lookup any easier?

I am aware that ultimately the design choice between reevaluating f(S) each time versus using std::map would have to be done based on actual profiling of my application. However, before implementing both ideas (the std::map route is more difficult to code in my underlying application), I would like to know if there are any known pre-existing best-practices / benchmarks.

Complexity of lookup in a map is O(log N) That is, roughly log N comparisons are needed when there are N elements in the map. The cost of the comparison itself adds to that linearly. For example when you compare M vectors with K elements, then there are roughly log N comparisons, each comparing M*K vector elements, ie in total O(M*K*log N) .

However, asymptotic complexity is only that: Asymptotic complexity. When there are only a small number of elements in the map then lower order factors might outweigh the log N that only dominates for large N . Consequently, the actual runtime depends on your specific application and you need to measure to be sure.

Moreover, you shouldn't use vectors as keys in the first place. Its a waste of memory. Subsets of S can be enumerated with a n-bit integer when S has n elements (simply set the i-th bit when i-th element of S is in the subset). Comparing a single integer (or bitset) is surely more efficient than comparing vectors of integers.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM