C++ Find in a vector of <int, pair>

Question

So previously I only had 1 key I needed to look up, so I was able to use a map:

std::map <int, double> freqMap;

But now I need to look up 2 different keys. I was thinking of using a vector with std::pair, i.e.:

std::vector <int, std::pair<int, double>> freqMap;

Eventually I need to look up both keys to find the correct value. Is there a better way to do this, or will this be efficient enough (it will have ~3k entries)? Also, I'm not sure how to search using the second key (the first key in the std::pair). Is there a find for the pair based on the first key? Essentially I can access the first key by:

freqMap[key1]

But I'm not sure how to iterate and find the second key in the pair.

Edit: Ok adding the use case for clarification:

I need to look up a val based on 2 keys, a mux selection and a frequency selection. The raw data looks something like this:

Mux, Freq, Val
0, 1000, 1.1
0, 2000, 2.7
0, 10e9, 1.7
1, 1000, 2.2
1, 2500, 0.8
6, 2000, 2.2

Answer 1

The blanket answer to "which is faster" is generally "you have to benchmark it".

But besides that, you have a number of options. On paper, a std::map has a better asymptotic lookup cost than, say, a linear scan over a vector, but that does not necessarily make it faster in practice. If this really is performance critical (i.e. you are not optimising prematurely), try the different approaches sketched below and measure the performance you get, both memory-wise and CPU-wise.

Instead of using a std::map, consider putting your data into a struct, giving its members proper names, and storing all entries in a simple std::vector. If you modify the data only seldom, you can reduce retrieval cost at the expense of additional insertion cost by keeping the vector sorted by the key you typically use to find an entry. This allows you to do binary search, which can be much faster than linear search.
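As a rough illustration of the sorted-vector idea, here is a minimal sketch. The names FreqEntry, keyLess, sortByKey and findSorted are made up for this example, and it assumes C++17 (for std::optional) and that both keys are always known at lookup time, so (mux, freq) can serve as one combined sort key.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <tuple>
#include <vector>

// Hypothetical record type; the fields follow the Mux, Freq, Val columns in the question.
struct FreqEntry {
    int mux;
    std::uint64_t freq;
    double val;
};

// Orders entries lexicographically by (mux, freq).
inline bool keyLess(const FreqEntry& a, const FreqEntry& b) {
    return std::tie(a.mux, a.freq) < std::tie(b.mux, b.freq);
}

// Sort once after loading the data; later lookups can then use binary search.
inline void sortByKey(std::vector<FreqEntry>& table) {
    std::sort(table.begin(), table.end(), keyLess);
}

// Binary search for an exact (mux, freq) match.
// Precondition: table has been sorted with sortByKey.
inline std::optional<double> findSorted(const std::vector<FreqEntry>& table,
                                        int mux, std::uint64_t freq) {
    const FreqEntry probe{mux, freq, 0.0};
    auto it = std::lower_bound(table.begin(), table.end(), probe, keyLess);
    if (it != table.end() && it->mux == mux && it->freq == freq) {
        return it->val;
    }
    return std::nullopt;
}
```

Every insertion after the initial sort has to keep the order intact (or re-sort), which is the additional insertion cost mentioned above.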

However, linear search can be surprisingly fast on a std::vector because of both cache locality and branch prediction, both of which you are likely to lose when dealing with a map, unordered_map or (binary searched) sorted vector. So, although O(n) sounds much scarier than, say, O(log n) for map or even O(1) for unordered_map, it can still be faster under the right conditions.
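For comparison, a linear scan over an unsorted vector might look like the sketch below. Row and findLinear are names invented for this example. Whether it beats the alternatives for ~3k entries is something only a benchmark on your data can tell.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical row type mirroring the Mux, Freq, Val columns from the question.
struct Row {
    int mux;
    std::uint64_t freq;
    double val;
};

// Linear scan over contiguous memory.
// Returns a pointer to the matching value, or nullptr if no row matches both keys.
inline const double* findLinear(const std::vector<Row>& rows,
                                int mux, std::uint64_t freq) {
    auto it = std::find_if(rows.begin(), rows.end(), [&](const Row& r) {
        return r.mux == mux && r.freq == freq;
    });
    return it != rows.end() ? &it->val : nullptr;
}
```

The returned pointer is only valid as long as the vector is not modified.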

Especially if you discover that you don't have a discernible index member you can use to sort your entries, you will have to either stick to linear search in contiguous memory (i.e. a vector) or invest in a doubly indexed data structure (effectively something akin to two maps or two unordered_maps). Having two indexes usually prevents you from using a single map/unordered_map.

If you can pack your data more tightly (e.g. do you need an int, or would a std::uint8_t do the job? Do you need a double?), you will improve cache locality, and for only 3k entries there is a good chance that an unsorted vector performs best. Although operations on a std::size_t are typically faster in themselves than operations on smaller types, iterating over contiguous memory usually offsets this effect.

Conclusion: Try an unsorted vector, a sorted vector (+ binary search), a map and an unordered_map. Do proper benchmarking (with several repetitions) and pick the fastest one. If it doesn't make a difference, pick the one that is the most straightforward to understand.
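For the map candidate, one possible shape (an assumption on my side, since the question does not show it) is a single std::map keyed on a std::pair of both keys. This only works if both keys are always known at lookup time, as in the mux/frequency use case. FreqKey, FreqMap and findInMap are illustrative names.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Composite key (mux, freq). std::pair already compares lexicographically,
// so it can be used as a std::map key without a custom comparator.
using FreqKey = std::pair<int, std::uint64_t>;
using FreqMap = std::map<FreqKey, double>;

// Returns a pointer to the value stored for (mux, freq), or nullptr if absent.
inline const double* findInMap(const FreqMap& m, int mux, std::uint64_t freq) {
    auto it = m.find(FreqKey{mux, freq});
    return it != m.end() ? &it->second : nullptr;
}
```

Entries can be added with, for example, `m.emplace(FreqKey{0, 1000}, 1.1);`.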

Edit: Given your example data, it sounds like the first key has an extremely small domain. As far as I can tell, "Mux" seems to be limited to a small number of different values which are near each other. In such a situation you may consider using a std::array as your primary indexing structure and a suitable lookup structure as your secondary one. For example, either of:

std::array<std::vector<std::pair<std::uint64_t,double>>,10>
std::array<std::unordered_map<std::uint64_t,double>,10>
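A minimal sketch of how the second variant could be used follows. MuxTable, insertValue and lookupValue are hypothetical names, and the code assumes that every mux value lies in the range 0 to 9, matching the array size of 10 in the declarations above. Adjust the bound to your real data.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Assumption: mux values lie in [0, kMuxCount).
constexpr std::size_t kMuxCount = 10;
using MuxTable = std::array<std::unordered_map<std::uint64_t, double>, kMuxCount>;

// Stores a value under (mux, freq); returns false if mux is outside the assumed range.
inline bool insertValue(MuxTable& table, std::size_t mux,
                        std::uint64_t freq, double val) {
    if (mux >= kMuxCount) return false;
    table[mux][freq] = val;
    return true;
}

// Returns a pointer to the stored value, or nullptr if either key is missing.
inline const double* lookupValue(const MuxTable& table, std::size_t mux,
                                 std::uint64_t freq) {
    if (mux >= kMuxCount) return nullptr;
    auto it = table[mux].find(freq);
    return it != table[mux].end() ? &it->second : nullptr;
}
```

The first key becomes a plain array index, so only the frequency needs an actual search.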
