What is wrong with `std::set`?

In another topic I was trying to solve the following problem: remove duplicate characters from a std::string.

std::string s= "saaangeetha";

Since the order was not important, I sorted s first, then used std::unique, and finally resized it to get the desired result:

aeghnst

That is correct!

Now I want to do the same, but keep the original order of the characters. That is, I want this output:

sangeth

So I wrote this:

template<typename T>
struct is_repeated
{
    std::set<T>  unique;
    bool operator()(T c) { return !unique.insert(c).second; }
}; 
int main() {
    std::string s= "saaangeetha";
    s.erase(std::remove_if(s.begin(), s.end(), is_repeated<char>()), s.end()); 
    std::cout << s ;
}

Which gives this output:

saangeth

That is, a is repeated, though the other repetitions are gone. What is wrong with the code?

Anyway, I changed my code a bit (see the comments):

template<typename T>
struct is_repeated
{
    std::set<T> & unique;  //made reference!
    is_repeated(std::set<T> &s) : unique(s) {} //added line!
    bool operator()(T c) { return !unique.insert(c).second; }
}; 
int main() {
    std::string s= "saaangeetha";
    std::set<char> set; //added line!
    s.erase(std::remove_if(s.begin(),s.end(),is_repeated<char>(set)),s.end()); 
    std::cout << s ;
}

Output:

sangeth

Problem gone!

So what is wrong with the first solution?

Also, if I don't make the member variable unique a reference, the problem doesn't go away.

What is wrong with std::set or the is_repeated functor? Where exactly is the problem?

I also note that if the is_repeated functor is copied somewhere, then every member of it is also copied. I don't see the problem here!

Functors are supposed to be designed so that a copy of a functor is identical to the original functor. That is, if you make a copy of one functor and then perform a sequence of operations, the result should be the same no matter which functor you use, or even if you interleave the two functors. This gives the STL implementation the flexibility to copy functors and pass them around as it sees fit.

With your first functor, this claim does not hold: if I copy your functor and then call the copy, the changes made to its stored set are not reflected in the original functor, so the copy and the original will behave differently. Similarly, if you take your second functor and make it not store its set by reference, the two copies of the functor will not behave identically.

Your final version of the functor works because the set is stored by reference, which means that any number of copies of the functor share one set and behave identically to one another.
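The divergence between a functor and its copy can be seen without any algorithm at all. The following is a minimal sketch, assuming the functor from the question; the helper name original_forgets_what_copy_saw is hypothetical and exists only for illustration.

```cpp
#include <set>

// Same functor as in the question: the set is stored by value,
// so every copy of the functor gets its own independent set.
template <typename T>
struct is_repeated {
    std::set<T> unique;
    bool operator()(T c) { return !unique.insert(c).second; }
};

// Hypothetical helper: reports whether the original functor still treats 'a'
// as new after a copy of it has already seen 'a'.
bool original_forgets_what_copy_saw() {
    is_repeated<char> original;
    is_repeated<char> copy = original;  // the kind of copy an algorithm may make

    copy('a');  // recorded in the copy's set only

    bool repeated_for_copy = copy('a');          // the copy remembers 'a'
    bool repeated_for_original = original('a');  // the original's set was empty

    return repeated_for_copy && !repeated_for_original;
}
```

Calling original_forgets_what_copy_saw() returns true: the copy reports 'a' as repeated while the original does not, which is exactly the mismatch described above.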

Hope this helps!

In GCC (libstdc++), remove_if is implemented essentially as

    template<typename It, typename Pred>
    It remove_if(It first, It last, Pred predicate) {
      first = std::find_if(first, last, predicate);
    //                                  ^^^^^^^^^
      if (first == last)
         return first;
      else {
         It result = first;   // slot of the first element to remove
         ++ first;            // resume the scan just past it
         for (; first != last; ++ first) {
           if (!predicate(*first)) {
    //          ^^^^^^^^^
              *result = std::move(*first);
              ++ result;
           }
         }
         return result;
      }
    }

Note that your predicate is passed by value to find_if, so the changes find_if makes to the struct, and therefore to the set, are not propagated back to the caller.

Since the first duplicate appears at:

  saaangeetha
//  ^

The initial "sa" will be kept after the find_if call. Meanwhile, the predicate's set is still empty (the insertions within find_if are local to its copy). Therefore the loop afterwards will keep the 3rd a.

   sa | angeth
// ^^   ^^^^^^
// ||   kept by the loop in remove_if
// ||
// kept by find_if
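The walkthrough above can be reproduced independently of any particular standard library. The following is a sketch under stated assumptions: sketch_remove_if and erase_repeats_with_sketch are hypothetical names, and the function only models the find-then-compact structure described in this answer; it is not the actual libstdc++ source.

```cpp
#include <set>
#include <string>
#include <utility>

// Functor from the question, with the set stored by value.
template <typename T>
struct is_repeated {
    std::set<T> unique;
    bool operator()(T c) { return !unique.insert(c).second; }
};

// Hypothetical stand-in for the library algorithm; not the real libstdc++ code.
template <typename It, typename Pred>
It sketch_remove_if(It first, It last, Pred pred) {
    Pred search_copy = pred;  // models find_if receiving the predicate by value
    while (first != last && !search_copy(*first)) ++first;
    if (first == last) return first;

    It result = first;  // slot of the first element to remove
    ++first;            // resume the scan just past it
    for (; first != last; ++first) {
        if (!pred(*first)) {  // pred's own set is still empty when this loop starts
            *result = std::move(*first);
            ++result;
        }
    }
    return result;
}

// Hypothetical helper that applies the sketch to a string.
std::string erase_repeats_with_sketch(std::string s) {
    s.erase(sketch_remove_if(s.begin(), s.end(), is_repeated<char>()), s.end());
    return s;
}
```

With this sketch, erase_repeats_with_sketch("saaangeetha") yields "saangeth", the same output the question reports: the search copy remembers "sa", but the compaction loop starts with an empty set and lets the third a through.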

Not really an answer, but as another interesting tidbit to consider, this does work, even though it uses the original functor:

#include <set>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>

template<typename T>
struct is_repeated {
    std::set<T>  unique;
    bool operator()(T c) { return !unique.insert(c).second; }
}; 
int main() {
    std::string s= "saaangeetha";
    std::remove_copy_if(s.begin(), s.end(), 
                        std::ostream_iterator<char>(std::cout), 
                        is_repeated<char>());
    return 0;
}

Edit: I don't think it affects this behavior, but I've also corrected a minor slip in your functor (operator() should apparently take a parameter of type T, not char).

I suppose the problem could lie in the is_repeated functor being copied somewhere inside the implementation of std::remove_if. If that is the case, the default copy constructor is used, and this in turn calls the std::set copy constructor. You end up with two is_repeated functors, possibly used independently. However, as the sets in the two functors are distinct objects, neither sees the other's changes. If you turn the field is_repeated::unique into a reference, the copied functor still uses the original set, which is what you want in this case.

Functor classes used as predicates should behave like pure functions and have no state of their own. See Item 39 in Scott Meyers's Effective STL for a good explanation of this. The gist of it is that your functor may be copied one or more times inside the algorithm.
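One way to avoid handing a stateful predicate to an algorithm at all is to keep the state in a plain loop. This is a minimal sketch of that idea, not something proposed in the original answers; the function name unique_in_order is hypothetical.

```cpp
#include <set>
#include <string>

// Hypothetical helper: keeps the first occurrence of each character and
// preserves the original order. The set lives in the function itself, so no
// library algorithm ever gets a chance to copy it.
std::string unique_in_order(const std::string& s) {
    std::set<char> seen;
    std::string out;
    for (char c : s) {
        // insert(...).second is true only the first time c is encountered
        if (seen.insert(c).second) {
            out.push_back(c);
        }
    }
    return out;
}
```

For the string from the question, unique_in_order("saaangeetha") gives "sangeth". The trade-off is that the result is built in a new string rather than compacted in place.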

The other answers are correct in that the issue is that the functor you are using is not safe to copy. In particular, the STL that comes with gcc (4.2) implements std::remove_if as a combination of std::find_if, to locate the first element to delete, followed by std::remove_copy_if to complete the operation.

template <typename ForwardIterator, typename Predicate>
ForwardIterator remove_if( ForwardIterator first, ForwardIterator last, Predicate pred ) {
   first = std::find_if( first, last, pred ); // [1]
   ForwardIterator i = first;
   return first == last ? first
          : std::remove_copy_if( ++i, last, first, pred ); // [2]
}

The copy in [1] means that the characters examined by find_if, including the first 'a', are recorded only in the copy's set, so that knowledge is lost when find_if returns. The functor is also copied in [2], which would be fine were it not that the original being copied still holds an empty set.

Depending on the implementation, remove_if can make copies of your predicate. Either refactor your functor and make it stateless, or use Boost.Ref, which is meant "for passing references to function templates (algorithms) that would usually take copies of their arguments", like so:

#include <set>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>

#include <boost/ref.hpp>
#include <boost/bind.hpp>

template<typename T>
struct is_repeated {
    std::set<T>  unique;
    bool operator()(T c) { return !unique.insert(c).second; }
}; 

int main() {
    std::string s= "saaangeetha";
    is_repeated<char> pred; // named object: boost::ref needs an lvalue, not a temporary
    s.erase(std::remove_if(s.begin(), s.end(), boost::bind<bool>(boost::ref(pred), _1)), s.end());
    std::cout << s;

    return 0;
}
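Since C++11 the same idea can be written without Boost, because std::ref returns a std::reference_wrapper that is itself callable. The following is a sketch under assumptions: the helper name remove_repeats is hypothetical, and the code relies on the algorithm visiting elements front to back, which common implementations do but which is treated as an assumption here.

```cpp
#include <algorithm>
#include <functional>
#include <set>
#include <string>

// Functor from the question, unchanged: the set is stored by value.
template <typename T>
struct is_repeated {
    std::set<T> unique;
    bool operator()(T c) { return !unique.insert(c).second; }
};

// Hypothetical helper: removes repeated characters while keeping order.
std::string remove_repeats(std::string s) {
    is_repeated<char> pred;  // named object, so std::ref can bind to it
    // The algorithm may copy the reference_wrapper freely; every copy
    // forwards calls to the single pred object and its single set.
    s.erase(std::remove_if(s.begin(), s.end(), std::ref(pred)), s.end());
    return s;
}
```

Here remove_repeats("saaangeetha") produces "sangeth", because all copies made inside std::remove_if refer back to the same set.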
