I have a set of N
customers, indexed 0,...,N-1
. Periodically, for some subset S
of customers, I need to evaluate a function f(S)
. Computing f(S)
is of linear complexity in |S|
. The set S
of customers is represented as an object of type std::vector<int>
. The subsets that come up for evaluation can be of different size each time. [Since the order of customers in S
does not matter, the set can as well be represented as an object of type std::set<int>
or std::unordered_set<int>
.]
In the underlying application, I may have the same subset S
of customers come up multiple times for evaluation of f(S)
. Instead of incurring the needless linear complexity each time, I am looking to see if it would benefit from some sort of less computational burdensome lookup.
I am considering having a map of key-value pairs where the key is directly the vector of customers, std::vector<int> S
and the value mapped to this key is f(S)
. That way, I am hoping that I can first check to see if a key already exists in the map, and if it does, I can look it up without having to compute f(.)
again.
Having an std::map
with std::vector
as keys is well-defined. See, for eg, here .
CPPReference indicates that map lookup time is logarithmic. But I suppose this is logarithmic in the number of key
s where each key if of a constant length -- such as an int
or a double
, etc. How is the complexity affected where the key itself need not be of constant length and can be of arbitrary length upto size N
?
Since the keys can themselves be of different sizes (subset of customers that come up for evaluation could be different each time), does this introduce any additional complexity in computing a hash function or the compare operation for the std::map
? Is there any benefit to maintain the key as a binary array a fixed length N
? This binary array is such that B_S[i]=1
if the i
th customer is in set S
, and it is 0 otherwise. Does this make the lookup any easier?
I am aware that ultimately the design choice between reevaluating f(S)
each time versus using std::map
would have to be done based on actual profiling of my application. However, before implementing both ideas (the std::map
route is more difficult to code in my underlying application), I would like to know if there are any known pre-existing best-practices / benchmarks.
Complexity of lookup in a map is O(log N)
That is, roughly log N
comparisons are needed when there are N
elements in the map. The cost of the comparison itself adds to that linearly. For example when you compare M
vectors with K
elements, then there are roughly log N
comparisons, each comparing M*K
vector elements, ie in total O(M*K*log N)
.
However, asymptotic complexity is only that: Asymptotic complexity. When there are only a small number of elements in the map then lower order factors might outweigh the log N
that only dominates for large N
. Consequently, the actual runtime depends on your specific application and you need to measure to be sure.
Moreover, you shouldn't use vectors as keys in the first place. Its a waste of memory. Subsets of S
can be enumerated with a n-bit integer when S
has n
elements (simply set the i-th bit when i-th element of S
is in the subset). Comparing a single integer (or bitset) is surely more efficient than comparing vectors of integers.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.