How to detect duplicates in a vector of unordered_map?

Given a vector of unordered_map<u_int,int>, I would like to check whether the vector contains any duplicated values. Two unordered_maps are considered duplicates if all of their keys and the corresponding values are equal. I know that the comparison operator exists for unordered_map, but I would like to avoid comparing every element pairwise with every other one. One classical solution is to insert the values of the vector into a set and then compare the number of elements in the set and in the vector. However, the problem here is that an object inserted into a set must have the comparison operators overloaded, and for an unordered_set a hash function must be provided for the complex object. To do that, I believe I would need to derive a class from std::unordered_map and then overload either the comparison operator or the hash function. Another solution I could think of is to concatenate all of the key-value pairs, sorted by key, into a string and then detect the duplicates among those strings. I wonder what the best solution for this problem would be.
Example data:

using namespace std;
typedef unordered_map<u_int,int> int_map;
int_map a = { {1,1}, {2,4}, {3,5} };
int_map b = { {1,1}, {2,-1}, {4,-2} };
int_map c = { {1,1}, {3,5} };

vector<unordered_map<u_int,int>> my_vec;

my_vec.push_back(a);
my_vec.push_back(b);
my_vec.push_back(c);

The contents of my_vec are:

 { { 1 => 1, 2 => 4, 3 => 5 }, 
 { 1 => 1, 2 => -1, 4 => -2 }, 
 { 1 => 1, 3 => 5 } }

Please feel free to ask/comment/edit if the question is not clear enough. Any help would be appreciated. Thank you in advance!

You can do something similar to the following:

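#include <functional>
#include <unordered_map>
#include <unordered_set>

typedef std::unordered_map<u_int,int> int_map;

// Order-independent hash: equal maps must hash equally whatever their iteration order.
struct my_map_hash
{
    std::size_t operator()(const int_map& m) const
    {
      std::size_t h = 0;
      for (const auto& kv : m)  // addition is commutative, so the order does not matter
        h += std::hash<u_int>()(kv.first) * 31 + std::hash<int>()(kv.second);
      return h;
    }
};

// Compare the maps themselves: equal hashes alone do not prove equal maps.
struct my_map_comparator
{
    bool operator()(const int_map& a, const int_map& b) const
    {
      return a == b;
    }
};

std::unordered_set<int_map, my_map_hash, my_map_comparator> map_list;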
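To connect this with the set-size check described in the question, here is a minimal self-contained sketch. It assumes u_int is unsigned int; map_hash and has_duplicates_via_set are hypothetical names introduced for illustration, and the hash is just one simple order-independent choice, not a tuned one.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Assumption: u_int from the question is unsigned int.
using int_map = std::unordered_map<unsigned int, int>;

// Hypothetical order-independent hash: per-entry hashes are summed,
// so the iteration order of the map has no effect on the result.
struct map_hash {
    std::size_t operator()(const int_map& m) const {
        std::size_t h = 0;
        for (const auto& kv : m) {
            h += std::hash<unsigned int>()(kv.first) * 31u + std::hash<int>()(kv.second);
        }
        return h;
    }
};

// Returns true if at least two maps in v are equal.
// Equality inside the set falls back to operator== of unordered_map.
bool has_duplicates_via_set(const std::vector<int_map>& v) {
    std::unordered_set<int_map, map_hash> seen(v.begin(), v.end());
    return seen.size() != v.size();
}
```

Note that this copies every map into the set, which may matter for large inputs. No class derived from std::unordered_map is needed: the hash is passed as a template argument.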

If you can get a good hash function for std::unordered_map, then you should probably do it like this:

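bool has_distinct_values(const std::vector<std::unordered_map<u_int, int>>& v)
{
  // Maps each hash to the indexes of the maps in v that produced it.
  std::unordered_map<std::size_t, std::list<std::size_t>> hash_to_indexes_map;
  for (std::size_t i = 0; i < v.size(); ++i)
  {
    // custom_hash is the order-independent hash function you provide for the map type.
    auto empl_result = hash_to_indexes_map.emplace(custom_hash(v[i]), std::list<std::size_t>{i});
    if (!empl_result.second)
    {
       // This hash was seen before: compare only against the maps that share it.
       for (auto index : empl_result.first->second)
       {
         if (v[index] == v[i]) return false;
       }
       empl_result.first->second.push_back(i);
    }
  }
  return true;
}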

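The algorithm is straightforward: map each hash to a list of indexes into the vector, and do a pairwise map comparison only when two hashes are equal. This way you avoid copying the entire maps and, with a good hash function, need only O(N) hash computations and comparisons on average. Each hash computation is itself linear in the size of the map, and a poor hash that puts many maps under the same value degrades this toward O(N^2) comparisons.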
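The answer leaves custom_hash undefined, so here is one possible self-contained sketch of the whole approach. It assumes u_int is unsigned int; the summing hash is an illustrative choice (any hash that ignores iteration order would do), and operator[] is used in place of emplace purely for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <unordered_map>
#include <vector>

// Assumption: u_int in the original code is unsigned int.
using int_map = std::unordered_map<unsigned int, int>;

// Hypothetical custom_hash: sums a per-entry hash, so the result does not
// depend on the order in which the map is iterated.
std::size_t custom_hash(const int_map& m) {
    std::size_t h = 0;
    for (const auto& kv : m) {
        h += std::hash<unsigned int>()(kv.first) * 31u + std::hash<int>()(kv.second);
    }
    return h;
}

// Returns true if no two maps in v are equal.
bool has_distinct_values(const std::vector<int_map>& v) {
    // hash -> indexes of the maps in v that produced this hash
    std::unordered_map<std::size_t, std::list<std::size_t>> hash_to_indexes;
    for (std::size_t i = 0; i < v.size(); ++i) {
        auto& bucket = hash_to_indexes[custom_hash(v[i])];
        for (std::size_t index : bucket) {
            // Full map comparison happens only when the hashes are equal.
            if (v[index] == v[i]) return false;
        }
        bucket.push_back(i);
    }
    return true;
}
```

With the three maps from the question the function should report that all values are distinct; appending a copy of any of them should make it return false. Two different maps that happen to share a hash are still told apart by the operator== check.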
