
Why does std::set not have a "contains" member function?

I use std::set<int> heavily, and often I simply need to check whether such a set contains a number.

I'd find it natural to write:

if (myset.contains(number))
   ...

But because there is no contains member, I have to write the cumbersome:

if (myset.find(number) != myset.end())
  ..

or the less obvious:

if (myset.count(element) > 0) 
  ..

Is there a reason for this design decision?

I think it was probably because they were trying to make std::set and std::multiset as similar as possible. (And obviously count has a perfectly sensible meaning for std::multiset.)

Personally I think this was a mistake.

It doesn't look quite so bad if you pretend that count is just a misspelling of contains and write the test as:

if (myset.count(element)) 
   ...

It's still a shame though.

To be able to write if (s.contains()), contains() has to return a bool (or a type convertible to bool, which is another story), like binary_search does.

The fundamental reason behind the design decision not to do it this way is that a contains() which returns a bool would lose valuable information about where the element is in the collection. find() preserves and returns that information in the form of an iterator, and is therefore a better choice for a generic library like the STL. This has always been the guiding principle for Alex Stepanov, as he has often explained (for example, here).
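As a rough illustration of what "where the element is" buys you, here is a minimal sketch. The helper successor_or and the function example_usage are hypothetical names made up for this example, not part of the standard library. The sketch reuses the iterator returned by find() to reach the next element without searching a second time:

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical helper: if `value` is in the set, return the next larger
// element; otherwise (or if `value` is the largest element) return `fallback`.
int successor_or(const std::set<int>& s, int value, int fallback) {
    auto it = s.find(value);              // one lookup, which also yields the position
    if (it == s.end()) return fallback;   // value is not in the set
    auto next = std::next(it);            // reuse the position, no second search
    return next != s.end() ? *next : fallback;
}

// Small usage example for the helper above.
void example_usage() {
    std::set<int> s{10, 30, 40};
    assert(successor_or(s, 30, -1) == 40);  // 30 is present, successor is 40
    assert(successor_or(s, 20, -1) == -1);  // 20 is absent
}
```

A bool-returning contains() could answer only the first half of that question; the second half would need another search.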

As to the count() approach in general, although it's often an okay workaround, the problem with it is that it does more work than a contains() would have to do.

That is not to say that a bool contains() isn't very nice to have, or even necessary. A while ago we had a long discussion about this very same issue in the ISO C++ Standard - Future Proposals group.

It lacks it because nobody added it. Nobody added it because the containers from the STL that the std library incorporated were designed to be minimal in interface. (Note that std::string did not come from the STL in the same way.)

If you don't mind some strange syntax, you can fake it:

#include <utility>  // std::forward

template<class K>
struct contains_t {
  K&& k;
  template<class C>
  friend bool operator->*( C&& c, contains_t&& self ) {
    // the right-hand operand is named so that its member k is reachable
    auto range = std::forward<C>(c).equal_range(std::forward<K>(self.k));
    return range.first != range.second;
    // faster than:
    // return std::forward<C>(c).count( std::forward<K>(self.k) ) != 0;
    // for multi-meows with lots of duplicates
  }
};
template<class K>
contains_t<K> contains( K&& k ) {
  return {std::forward<K>(k)};
}

Use:

if (some_set->*contains(some_element)) {
}

Basically, you can write extension methods for most C++ std types using this technique.

It makes a lot more sense to just do this:

if (some_set.count(some_element)) {
}

but I am amused by the extension method method.

The really sad thing is that writing an efficient contains could be faster on a multimap or multiset, as they just have to find one element, while count has to find each of them and count them.

A multiset containing 1 billion copies of 7 (you know, in case you run out) can have a really slow .count(7), but could have a very fast contains(7).

With the above extension method, we could make it faster for this case by using lower_bound, comparing to end, and then comparing to the element. Doing that for an unordered meow as well as an ordered meow would require fancy SFINAE or container-specific overloads, however.
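The lower_bound approach described above might look roughly like the following sketch. The names contains_ordered and example_usage are hypothetical, and the sketch assumes a set-like ordered container (std::set or std::multiset) whose elements are the keys themselves, so it does not cover maps or the unordered containers:

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: membership test for std::set / std::multiset that
// stops at the first matching element instead of counting all duplicates.
template <class OrderedContainer, class Key>
bool contains_ordered(const OrderedContainer& c, const Key& key) {
    auto it = c.lower_bound(key);   // first element that is not less than key
    // key is present iff such an element exists and key is not less than it
    return it != c.end() && !c.key_comp()(key, *it);
}

// Small usage example for the helper above.
void example_usage() {
    std::multiset<int> lucky{7, 7, 7, 9};
    assert(contains_ordered(lucky, 7));    // found without visiting every 7
    assert(!contains_ordered(lucky, 8));   // lower_bound lands on 9, and 8 < 9
}
```

Unlike count(), this does a single logarithmic search no matter how many duplicates the multiset holds.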

You are looking at a particular case and not seeing the bigger picture. As stated in the documentation, std::set meets the requirements of the AssociativeContainer concept. For that concept it does not make sense to have a contains method, as it is pretty much useless for std::multiset and std::multimap, but count works fine for all of them. A contains method could be added as an alias for count for std::set, std::map and their hashed versions (like length for size() in std::string), but it looks like the library creators did not see a real need for it.

Although I don't know why std::set has no contains but does have count, which only ever returns 0 or 1, you can write a templated contains helper function like this:

template<class Container, class T>
auto contains(const Container& v, const T& x)
-> decltype(v.find(x) != v.end())
{
    return v.find(x) != v.end();
}

And use it like this:

    if (contains(myset, element)) ...

The true reason for set is a mystery to me, but one possible explanation for this same design in map could be to prevent people from writing inefficient code by accident:

if (myMap.contains("Meaning of universe"))
{
    myMap["Meaning of universe"] = 42;
}

Which would result in two map lookups.

Instead, you are forced to get an iterator. This gives you a mental hint that you should reuse the iterator:

auto position = myMap.find("Meaning of universe");
if (position != myMap.cend())
{
    position->second = 42;
}

which consumes only one map lookup.

When we realize that set and map are made from the same flesh, we can apply this principle to set as well. That is, if we want to act on an item only if it is present in the set, this design can prevent us from writing code like this:

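struct Dog
{
    std::string name;
    void bark() const;   // const, because elements of a std::set are immutable
};

bool operator<(const Dog& left, const Dog& right)
{
    return left.name < right.name;
}

std::set<Dog> dogs;
...
if (dogs.contains(Dog{"Husky"}))        // first lookup
{
    dogs.find(Dog{"Husky"})->bark();    // second lookup for the same key
}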

Of course, all this is mere speculation.

Since C++20,

bool contains( const Key& key ) const

is available.

What about binary_search?

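std::set<int> set1;
set1.insert(10);
set1.insert(40);
set1.insert(30);
// works because a std::set iterates in sorted order; set iterators are not
// random access, so this takes a linear number of iterator steps
bool found = std::binary_search(set1.begin(), set1.end(), 30);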

contains() has to return a bool. Using a C++20 compiler, I get the following output for the code:

#include<iostream>
#include<map>
using namespace std;

int main()
{
    multimap<char,int>mulmap;
    mulmap.insert(make_pair('a', 1)); //multiple similar key
    mulmap.insert(make_pair('a', 2)); //multiple similar key
    mulmap.insert(make_pair('a', 3)); //multiple similar key
    mulmap.insert(make_pair('b', 3));
    mulmap.insert({'a',4});
    mulmap.insert(pair<char,int>('a', 4));
    
    cout<<mulmap.contains('c')<<endl;  //Output:0 as it doesn't exist
    cout<<mulmap.contains('b')<<endl;  //Output:1 as it exist
}

I'd like to point out, as mentioned by Andy, that since C++20 the standard has added the contains member function for maps and sets:

bool contains( const Key& key ) const;  (since C++20)

Now I'd like to focus my answer on performance vs. readability. In terms of performance, compare the two versions:

#include <unordered_map>
#include <string>
using hash_map = std::unordered_map<std::string,std::string>;
hash_map a;

std::string get_cpp20(hash_map& x,std::string str)
{
    if(x.contains(str))
        return x.at(str);
    else
        return "";
};

std::string get_cpp17(hash_map& x,std::string str)
{
    if(const auto it = x.find(str); it !=x.end())
        return it->second;
    else
        return "";
};

You will find that the cpp20 version takes two calls to std::_Hash_find_last_result while the cpp17 version takes only one.

Now, I find myself with many data structures built from nested unordered_maps. So you end up with something like this:

using my_nested_map = std::unordered_map<std::string,std::unordered_map<std::string,std::unordered_map<int,std::string>>>;

std::string get_cpp20_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
    if(x.contains(level1) &&
        x.at(level1).contains(level2) &&
        x.at(level1).at(level2).contains(level3))

        return x.at(level1).at(level2).at(level3);
    else
        return "";
};

std::string get_cpp17_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
    if(const auto it_level1=x.find(level1); it_level1!=x.end())
        if(const auto it_level2=it_level1->second.find(level2);it_level2!=it_level1->second.end())
            if(const auto it_level3=it_level2->second.find(level3);it_level3!=it_level2->second.end())
                return it_level3->second;

    return "";
};

Now, if you have plenty of conditions in between these ifs, using the iterators really is painful, very error-prone and unclear. I often find myself looking back at the definition of the map to understand what kind of object is at level 1 or level 2, while with the cpp20 version you see at(level1).at(level2)... and understand immediately what you are dealing with. So in terms of code maintenance and review, contains is a very nice addition.

Another reason is that it would give a programmer the false impression that std::set is a set in the mathematical, set-theoretic sense. If they implemented that, then many other questions would follow: if a std::set has contains() for a value, why doesn't it have it for another set? Where are union(), intersection() and other set operations and predicates?

The answer is, of course, that some of the set operations are already implemented as functions in <algorithm> (std::set_union() etc.) and others are as trivially implemented as contains(). Functions and function objects work better with math abstractions than object members, and they are not limited to a particular container type.
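For illustration, here is a minimal sketch of those free functions applied to std::set. The wrappers is_subset, union_of and example_usage are hypothetical names made up for this example; std::includes, std::set_union and std::inserter are the standard facilities. It assumes both sets use the default ordering, which these algorithms rely on:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical wrapper: true if every element of `sub` is also in `super`.
bool is_subset(const std::set<int>& sub, const std::set<int>& super) {
    return std::includes(super.begin(), super.end(), sub.begin(), sub.end());
}

// Hypothetical wrapper: builds the union as a new set, leaving both operands unchanged.
std::set<int> union_of(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> result;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(result, result.end()));
    return result;
}

// Small usage example for the wrappers above.
void example_usage() {
    std::set<int> a{1, 2, 3};
    std::set<int> b{3, 4};
    assert(is_subset(std::set<int>{2, 3}, a));
    assert(!is_subset(b, a));                          // 4 is not in a
    assert((union_of(a, b) == std::set<int>{1, 2, 3, 4}));
}
```

Because the algorithms take iterator ranges, the same calls also work on a sorted std::vector, which is the sense in which they are not tied to one container type.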

If one needs to implement full math-set functionality, there is not only a choice of underlying container but also a choice of implementation details, e.g., would the theory_union() function work with immutable objects, better suited for functional programming, or would it modify its operands and save memory? Would it be implemented as a function object from the start, or would it be better to implement it as a C function and use std::function<> if needed?

As it is now, std::set is just a container, well suited for implementing a set in the math sense, but it is nearly as far from being a theoretical set as std::vector is from being a theoretical vector.

