
Check if unordered_set contains all elements in other unordered_set - C++

I'm brand new to C++ and have been asked to convert a Java program to C++. I'm trying to write a method that checks whether all elements of one unordered_set exist in another unordered_set. I found the example below, which uses hash_set, but hash_set is deprecated and unordered_set is now recommended instead.

// returns true if one contains all elements in two
bool SpecSet::containsAll(hash_set<Species*> one, hash_set<Species*> two) {
   sort(one.begin(), one.end());
   sort(two.begin(), two.end());
   return includes(one.begin(), one.end(), two.begin(), two.end());
}

So I need a way to do this using unordered_set. Sort does not work on unordered sets, and lookup speed is important, so I don't want to use an ordered set.

bool SpecSet::containsAll(unordered_set<Species*> one, unordered_set<Species*> two) {

   return ?;
}

I'd really appreciate some help with an approach to doing this efficiently.

EDIT: I guess this will work. It seems there is no more efficient way than to loop over every element of two.

bool SpecSet::containsAll(unordered_set<Species*> one, unordered_set<Species*> two) {
   if(two.size() > one.size())
   {
      return false;
   }

   for(Species *species : two)
   {
      if(one.find(species) == one.end())
      {
         return false;
      }
   }
   return true;
}

Disclaimer: This is not the most efficient approach. It is an attempt at a solution that is as generic and flexible as std::includes while supporting unordered iterator ranges. It is not limited to std::unordered_set and should work for any other container, e.g., std::vector or std::list.


As was pointed out, std::includes requires the input ranges to be sorted. At the moment, unordered ranges are not supported in the standard library.

Looking at possible implementations of std::includes, a version for unordered ranges can be implemented, for example like so:

template<class InputIt1, class InputIt2>
bool includes_unordered(
    InputIt1 first1, InputIt1 last1,
    InputIt2 first2, InputIt2 last2)
{
    for (; first2 != last2; ++first2)
    {
        InputIt1 it1;
        for (it1 = first1; it1 != last1; ++it1)
        {
            if(*first2 == *it1)
                break;
        }
        if (it1 == last1)
            return false;
    }
    return true;
}

Note: the container size-comparison optimization is deliberately omitted so that containers of non-unique objects are supported. If needed, it can be added using std::distance.
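As a rough sketch of that optional size check: the helper below is hypothetical (the names size_allows_inclusion and size_check_demo are not from the answer), and it assumes forward iterators and that the second range holds no duplicates. With duplicates in the second range, the check could wrongly reject a valid inclusion.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper (not part of the original answer): a cheap
// necessary condition for "range 1 includes range 2".
// Assumption: forward iterators, and range 2 contains no duplicates.
template <class ForwardIt1, class ForwardIt2>
bool size_allows_inclusion(ForwardIt1 first1, ForwardIt1 last1,
                           ForwardIt2 first2, ForwardIt2 last2)
{
    // std::distance is O(1) for random-access iterators, O(n) otherwise.
    return std::distance(first2, last2) <= std::distance(first1, last1);
}

// Hypothetical demo: a shorter range can never include a longer,
// duplicate-free one.
inline void size_check_demo()
{
    const std::vector<int> big{ 4, 1, 3, 2 };
    const std::list<int> small{ 2, 4 };

    assert(size_allows_inclusion(big.begin(), big.end(),
                                 small.begin(), small.end()));
    assert(!size_allows_inclusion(small.begin(), small.end(),
                                  big.begin(), big.end()));
}
```

A caller could run this check first and fall through to the full element-by-element comparison only when it returns true.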

And here's a version taking an equivalence operator:

template<class InputIt1, class InputIt2, class Equivalence>
bool includes_unordered(
    InputIt1 first1, InputIt1 last1,
    InputIt2 first2, InputIt2 last2,
    Equivalence equiv)
{
    for (; first2 != last2; ++first2)
    {
        InputIt1 it1;
        for (it1 = first1; it1 != last1; ++it1)
        {
            if(equiv(*first2, *it1))
                break;
        }
        if (it1 == last1)
            return false;
    }
    return true;
}


Then includes_unordered can be used in the same way as std::includes.
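For illustration, here is a minimal usage sketch that mixes an unsorted std::vector with a std::list. The template is repeated so the snippet stands alone; includes_unordered_demo is a hypothetical helper name, not something from the answer. Note that the first range is traversed once per element of the second range, so in practice it needs to be multi-pass (forward iterators or better), and the overall cost is O(n·m) comparisons.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Same algorithm as above: for every element of [first2, last2),
// scan [first1, last1) linearly for an equal element.
template <class InputIt1, class InputIt2>
bool includes_unordered(InputIt1 first1, InputIt1 last1,
                        InputIt2 first2, InputIt2 last2)
{
    for (; first2 != last2; ++first2)
    {
        InputIt1 it1 = first1;
        for (; it1 != last1; ++it1)
            if (*first2 == *it1)
                break;
        if (it1 == last1)
            return false;  // *first2 was not found in the first range
    }
    return true;
}

// Hypothetical demo helper showing calls across container types.
inline void includes_unordered_demo()
{
    const std::vector<int> haystack{ 5, 1, 4, 2, 3 };  // unsorted on purpose
    const std::list<int> needles{ 3, 1 };
    const std::list<int> missing{ 3, 7 };

    assert(includes_unordered(haystack.begin(), haystack.end(),
                              needles.begin(), needles.end()));
    assert(!includes_unordered(haystack.begin(), haystack.end(),
                               missing.begin(), missing.end()));
}
```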

With unsorted collections, there's no faster algorithm than to iterate over the smaller collection while testing that each element is a member of the larger. This naturally scales as O(n) on average, where n is the size of the putative subset, since we perform a find operation n times and each find on an unordered_set is O(1) on average.


Here's some demonstration code, with tests:

#include <unordered_set>

template <typename T>
bool is_subset_of(const std::unordered_set<T>& a, const std::unordered_set<T>& b)
{
    // return true if all members of a are also in b
    if (a.size() > b.size())
        return false;

    auto const not_found = b.end();
    for (auto const& element: a)
        if (b.find(element) == not_found)
            return false;

    return true;
}
int main()
{
    const std::unordered_set<int> empty{ };
    const std::unordered_set<int> small{ 1, 2, 3 };
    const std::unordered_set<int> large{ 0, 1, 2, 3, 4 };
    const std::unordered_set<int> other{ 0, 1, 2, 3, 9 };

    return 0
        +  is_subset_of(small, empty) // small ⊄ ∅
        + !is_subset_of(empty, small) // ∅ ⊂ small
        +  is_subset_of(large, small) // large ⊄ small
        + !is_subset_of(small, large) // small ⊂ large
        +  is_subset_of(large, other) // large ⊄ other
        +  is_subset_of(other, large) // other ⊄ large
        + !is_subset_of(empty, empty) // ∅ ⊂ ∅
        + !is_subset_of(large, large) // x ⊂ x, ∀x
        ;
}

An equivalent, using a standard algorithm instead of writing an explicit loop:

#include <algorithm>
#include <unordered_set>

template <typename T>
bool is_subset_of(const std::unordered_set<T>& a, const std::unordered_set<T>& b)
{
    // return true if all members of a are also in b
    auto const is_in_b = [&b](auto const& x){ return b.find(x) != b.end(); };

    return a.size() <= b.size() && std::all_of(a.begin(), a.end(), is_in_b);
}

(obviously using the same main() for tests)


Note that we pass the sets by reference, not by value, as you've indicated that the sets are too large for you to copy and sort them.
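Applied to the signature from the question, the same idea might look like the sketch below. It is written as a free function rather than a SpecSet member, and Species is an empty placeholder type here, since neither class is shown in the question. Because the sets hold Species* values, membership is decided by pointer identity, not by comparing the pointed-to objects.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>

struct Species {};  // placeholder: the real class is not shown in the question

// Returns true if `one` contains every pointer stored in `two`.
// Both sets are taken by const reference, so nothing is copied.
bool containsAll(const std::unordered_set<Species*>& one,
                 const std::unordered_set<Species*>& two)
{
    if (two.size() > one.size())
        return false;  // a larger set cannot be a subset of a smaller one

    return std::all_of(two.begin(), two.end(),
                       [&one](Species* s) { return one.count(s) != 0; });
}

// Hypothetical demo helper.
inline void containsAll_demo()
{
    Species a, b, c;
    const std::unordered_set<Species*> all{ &a, &b, &c };
    const std::unordered_set<Species*> some{ &a, &c };

    assert(containsAll(all, some));   // some is a subset of all
    assert(!containsAll(some, all));  // all has an element missing from some
}
```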
