How can I get the top n keys of std::map based on their values?

How can I get the top n keys of a std::map based on their values? For example, is there a way to get a list of the 10 keys with the largest values?
Suppose we have a map similar to this:

mymap["key1"]= 10;
mymap["key2"]= 3;
mymap["key3"]= 230;
mymap["key4"]= 15;
mymap["key5"]= 1;
mymap["key6"]= 66;
mymap["key7"]= 10; 

I only want a list of the top 10 keys, that is, the keys whose values are larger than the others. For example, the top 4 for our mymap are:

key3
key6
key4
key1 (tied with key7; both have the value 10)

Note 1:
The values are not unique; they are the number of occurrences of each key, and I want a list of the most frequently occurring keys.

Note 2:
If a map is not a good candidate and you want to suggest something else, please keep it to C++11; I can't use Boost at the moment.

Note 3:
If I use std::unordered_multimap<int,wstring> instead, do I have any other options?

A map is ordered by its keys, not its values, and it cannot be reordered. It is therefore necessary either to iterate over the map and maintain a list of the top ten encountered or, as commented by Potatoswatter, to use partial_sort_copy() to extract the top N values for you:

std::vector<std::pair<std::string, int>> top_four(4);
std::partial_sort_copy(mymap.begin(),
                       mymap.end(),
                       top_four.begin(),
                       top_four.end(),
                       [](std::pair<const std::string, int> const& l,
                          std::pair<const std::string, int> const& r)
                       {
                           return l.second > r.second;
                       });
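The other option mentioned above, iterating over the map while keeping only the best N entries seen so far, could be sketched with a bounded min-heap. This is only an illustration under stated assumptions: the function name top_n_keys and the Entry alias are made up here, and the map is assumed to be a std::map<std::string, int>.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the answer above): returns the keys of
// the n largest values in `counts`, largest first. Pairs compare
// lexicographically, so ties on the value are broken by key.
std::vector<std::string> top_n_keys(const std::map<std::string, int>& counts,
                                    std::size_t n)
{
    typedef std::pair<int, std::string> Entry; // (value, key)

    // Min-heap: the smallest of the kept entries is always on top,
    // so it is the one to evict when the heap grows past n.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > heap;

    for (const auto& kv : counts) {
        heap.push(Entry(kv.second, kv.first));
        if (heap.size() > n)
            heap.pop(); // drop the current smallest
    }

    // The heap pops smallest first, so fill the result back to front.
    std::vector<std::string> keys(heap.size());
    for (std::size_t i = keys.size(); i-- > 0; ) {
        keys[i] = heap.top().second;
        heap.pop();
    }
    return keys;
}
```

Calling top_n_keys(mymap, 3) on the map from the question should give key3, key6, key4. The heap never holds more than n + 1 entries, which may matter when the map is large and n is small.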


A different type of container may be more appropriate. boost::multi_index would be worth investigating; it:

... enables the construction of containers maintaining one or more indices with different sorting and access semantics.

#include <iostream>
#include <map>
#include <vector>
#include <algorithm>
#include <string>
#include <cstdlib>   // rand
using namespace std;

int main(int argc, const char * argv[])
{
    map<string, int> entries;

    // insert some random entries
    for(int i = 0; i < 100; ++i)
    {
        string name(5, 'A' + (char)(rand() % (int)('Z' - 'A') ));
        int number = rand() % 100;

        entries.insert(pair<string, int>(name, number));
    }

    // create container for top 10
    vector<pair<string, int>> sorted(10);

    // sort and copy, comparing on the second value of std::pair in descending order;
    // the comparison must be strict (false for equal values) to be a valid ordering
    partial_sort_copy(entries.begin(), entries.end(),
                      sorted.begin(), sorted.end(),
                      [](const pair<string, int> &a, const pair<string, int> &b)
    {
        return a.second > b.second;
    });

    cout << endl << "all elements" << endl;

    for(pair<string, int> p : entries)
    {
        cout << p.first << "  " << p.second << endl;
    }

    cout << endl << "top 10" << endl;

    for(pair<string, int> p : sorted)
    {
        cout << p.first << "  " << p.second << endl;
    }

    return 0;
}

Not only does std::map not sort by mapped-to value (such values need not have any defined sorting order), it doesn't allow rearrangement of its elements, so doing ++ map[ "key1" ]; on a hypothetical structure mapping the values back to the keys would invalidate the backward mapping.

Your best bet is to put the key-value pairs into another structure, and sort that by value at the time you need the backward mapping. If you need the backward mapping at all times, you would have to remove, modify, and re-add each time the value is changed.
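If the backward mapping is needed at all times, the remove, modify, and re-add cycle described above might look like the following sketch. The class name OccurrenceCounter and its members are hypothetical, and the values are assumed to be occurrence counts as in the question.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class for illustration: an occurrence counter with a
// by-count index that is kept in sync on every update.
class OccurrenceCounter
{
public:
    // Increment the count for `key`: remove the old (count, key) entry
    // from the index, modify the count, then re-add the new entry.
    void increment(const std::string& key)
    {
        int& count = counts_[key];                    // 0 for a new key
        by_count_.erase(std::make_pair(count, key));  // no-op if absent
        ++count;
        by_count_.insert(std::make_pair(count, key));
    }

    // Keys with the n highest counts, highest first.
    std::vector<std::string> top(std::size_t n) const
    {
        std::vector<std::string> keys;
        for (auto it = by_count_.rbegin();
             it != by_count_.rend() && keys.size() < n; ++it)
            keys.push_back(it->second);
        return keys;
    }

private:
    std::map<std::string, int> counts_;                // key -> count
    std::set<std::pair<int, std::string> > by_count_;  // (count, key), ascending
};
```

Each increment costs a few logarithmic-time operations, and top(n) only walks n entries, so this trades extra memory and slower updates for cheap queries.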

The most efficient way to sort the existing map into a new structure is std::partial_sort_copy, as (just now) illustrated by Al Bundy.

Since the mapped values are not indexed, you have to read every element and select the 10 biggest values.

// assuming mymap is a std::map<std::string, int>
typedef std::map<std::string, int>::mapped_type mapped_type;

std::vector<mapped_type> v;
v.reserve(mymap.size());

// collect the values (note: this discards the keys)
for(const auto& Pair : mymap)
 v.push_back( Pair.second );

// largest values first
std::sort(v.begin(), v.end(), std::greater<mapped_type>());

for(std::size_t i = 0, n = std::min<std::size_t>(10,v.size()); i < n; ++i)
  std::cout << v[i] << ' ';

Another way is to use two maps or a bimap, so that the mapped values are kept ordered as well.

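The algorithm you're looking for is nth_element, which partially sorts a range so that the nth element is where it would be in a fully sorted range and everything that sorts before it is placed ahead of it. For example, to move the top three items (under a descending predicate) to the front of the range, in no particular order among themselves, you'd write (in pseudo C++)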

nth_element(begin, begin + 3, end, predicate)

The problem is that nth_element doesn't work with std::map: it needs random-access iterators, and a map's elements cannot be rearranged. I would therefore suggest you change your data structure to a vector of pairs (and depending on the amount of data you're dealing with, you may find this to be a quicker data structure anyway). So, in the case of your example, I'd write it like this:

typedef vector<pair<string, int>> MyVector;
typedef MyVector::value_type ValueType;

MyVector v; 

// You should use an initialization list here if your
// compiler supports it (mine doesn't...)
v.emplace_back(ValueType("key1", 10));
v.emplace_back(ValueType("key2", 3));
v.emplace_back(ValueType("key3", 230));
v.emplace_back(ValueType("key4", 15));
v.emplace_back(ValueType("key5", 1));
v.emplace_back(ValueType("key6", 66));
v.emplace_back(ValueType("key7", 10));

nth_element(v.begin(), v.begin() + 3, v.end(), 
    [](ValueType const& x, ValueType const& y) -> bool
    {
        // sort descending by value
        return y.second < x.second;
    });

// print out the top three elements (nth_element leaves them in unspecified order)
for (size_t i = 0; i < 3; ++i)
    cout << v[i].first << ": " << v[i].second << endl;
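
Since nth_element only guarantees that the first n elements are the n largest, not that they are ordered among themselves, a follow-up sort of that prefix is one way to get them in descending order. The sketch below assumes the same vector-of-pairs layout; the names Entry, by_value_desc and order_top_n are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> Entry; // hypothetical alias: (key, value)

// Comparator: descending by value.
inline bool by_value_desc(const Entry& x, const Entry& y)
{
    return y.second < x.second;
}

// Hypothetical helper: moves the n largest entries to the front of `v`
// and puts those n in descending order. The rest of `v` is left in an
// unspecified order.
void order_top_n(std::vector<Entry>& v, std::size_t n)
{
    n = std::min(n, v.size());
    // Step 1: partition so that the first n entries are the n largest.
    std::nth_element(v.begin(), v.begin() + n, v.end(), by_value_desc);
    // Step 2: sort only that prefix.
    std::sort(v.begin(), v.begin() + n, by_value_desc);
}
```

std::partial_sort(v.begin(), v.begin() + n, v.end(), by_value_desc) produces the same ordered prefix in a single call; which of the two is faster depends on the data and the library implementation.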

Another option is to wrap the map in a small class that also keeps a vector of the pairs sorted by value:

#include <iostream>
#include <vector>
#include <map>
#include <string>
#include <algorithm>
#include <cassert>
#include <iterator>
using namespace std;

class MyMap
{
public:
    // Insert or update a key, keeping _vec sorted by descending value.
    void addValue(const string& key, int value)
    {
        _map[key] = value;
        // Update the existing entry if the key is already present,
        // so _vec never holds two entries for the same key.
        auto it = find_if(_vec.begin(), _vec.end(),
                          [&key](const Pair& p) { return p.first == key; });
        if (it != _vec.end())
            it->second = value;
        else
            _vec.push_back(make_pair(key, value));
        sort(_vec.begin(), _vec.end(), Cmp());
    }
    // Return up to n pairs with the largest values, largest first.
    vector<pair<string, int> > getTop(size_t n) const
    {
        size_t len = min(n, _vec.size());
        return PairVector(_vec.begin(), _vec.begin() + len);
    }
private:
    typedef map<string, int> StrIntMap;
    typedef vector<pair<string, int> > PairVector;
    typedef pair<string, int> Pair;
    StrIntMap  _map;
    PairVector _vec;
    // Orders pairs by descending value.
    struct Cmp
    {
        bool operator()(const Pair& left, const Pair& right) const
        {
            return right.second < left.second;
        }
    };
};

int main()
{
    MyMap mymap;
    mymap.addValue("key1", 10);
    mymap.addValue("key2", 3);
    mymap.addValue("key3", 230);
    mymap.addValue("key4", 15);
    mymap.addValue("key6", 66);
    mymap.addValue("key7", 10);

    auto res = mymap.getTop(3);

    for_each(res.begin(), res.end(), [](const pair<string, int> value)
                                        {cout<<value.first<<" "<<value.second<<endl;});
}

The simplest solution would be to use std::transform to build a second map keyed by value (a std::multimap, since the values are not unique):

typedef std::multimap<int, std::string> SortedByValue;
SortedByValue map2;
std::transform(
    mymap.begin(), mymap.end(),
    std::inserter( map2, map2.end() ),
    []( std::pair<std::string const, int> const& original ) {
        return std::pair<int, std::string>( original.second, original.first );
        } );

Then pick off the last n elements of map2.
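
Picking off the last n elements can be done with reverse iterators, roughly as in this sketch. The helper name last_n_keys is hypothetical, and a std::multimap<int, std::string> is assumed for the second map (a plain std::map<int, std::string> would keep only one key per value).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: `by_value` maps each value to its key. Returns the
// keys of the n largest values, largest first.
std::vector<std::string> last_n_keys(
    const std::multimap<int, std::string>& by_value, std::size_t n)
{
    std::vector<std::string> keys;
    // Reverse iteration visits the largest values first.
    for (auto it = by_value.rbegin();
         it != by_value.rend() && keys.size() < n; ++it)
        keys.push_back(it->second);
    return keys;
}
```

If the second map holds fewer than n elements, the loop simply stops at the end and returns what is there.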

Alternatively (and probably more efficiently), you could use a std::vector<std::pair<int, std::string>> and sort it afterwards:

std::vector<std::pair<int, std::string>> map2( mymap.size() );
std::transform(
    mymap.begin(), mymap.end(),
    map2.begin(),
    []( std::pair<std::string const, int> const& original ) {
        return std::pair<int, std::string>( original.second, original.first );
        } );
// ascending by value: the top n are the last n elements
std::sort( map2.begin(), map2.end() );

(Note that these solutions optimize for time, at the cost of more memory.)
