Unexpected behavior using `std::count` on `std::vector` of pairs

My goal is to completely remove all elements in a std::vector<std::pair<int, int>> that occur more than once.

The idea was to use std::remove_if with std::count as part of the predicate. My approach looks something like this:

#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>

using std::cout;
using std::endl;
using i_pair = std::pair<int, int>;

int main()
{
    std::vector<i_pair> vec;
    vec.push_back(i_pair(0,0)); // Expected to stay
    vec.push_back(i_pair(0,1)); // Expected to go
    vec.push_back(i_pair(1,1)); // Expected to stay
    vec.push_back(i_pair(0,1)); // Expected to go

    auto predicate = [&](i_pair& p)
    {
        return std::count(vec.begin(), vec.end(), p) > 1;
    };
    auto it = std::remove_if(vec.begin(), vec.end(), predicate);

    cout << "Reordered vector:" << endl;
    for(auto& e : vec)
    {
        cout << e.first << " " << e.second << endl;
    }
    cout << endl;
    
    cout << "Number of elements that would be erased: " << (vec.end() - it) << endl;

    return 0;
}

The vector gets reordered with both of the (0,1) elements pushed to the end; however, the iterator returned by std::remove_if points at the last element. This means that a subsequent erase operation would only get rid of one (0,1) element.

Why is this behavior occurring and how can I delete all elements that occur more than once?

Your biggest problem is that std::remove_if gives very few guarantees about the contents of the vector while it is running.

It guarantees only that, once it has finished, the range from begin() to the returned iterator contains the elements that were not removed, and that the range from there to end() contains some other elements with unspecified values.

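Meanwhile, your predicate is iterating over the container in the middle of this operation. With a typical implementation, by the time the second (0,1) is tested the first one has already been overwritten by (1,1), so std::count finds only one occurrence and that element is kept.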
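One way to sidestep this, sketched here as an illustration rather than as part of the original answer, is to count against a copy of the vector that std::remove_if never touches. The helper name erase_repeated_snapshot is made up for this example, and the approach is still O(n^2).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// Hypothetical helper (not from the original post): erase every value that
// occurs more than once, counting against an untouched copy of the input.
void erase_repeated_snapshot(std::vector<i_pair>& vec)
{
    // std::remove_if may overwrite elements of vec while it runs, so the
    // predicate counts in this snapshot instead of in vec itself.
    const std::vector<i_pair> snapshot = vec;

    auto occurs_more_than_once = [&snapshot](const i_pair& p)
    {
        return std::count(snapshot.begin(), snapshot.end(), p) > 1;
    };

    vec.erase(std::remove_if(vec.begin(), vec.end(), occurs_more_than_once),
              vec.end());
}
```

Because std::remove_if keeps the relative order of the surviving elements, calling this on the vector from the question should leave (0,0) followed by (1,1).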

std::partition is more likely to work, as it guarantees (when done) that the elements you are "removing" are actually stored at the end.
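A rough sketch of the std::partition idea follows. It rests on an assumption that is not a documented guarantee: that std::partition rearranges elements only by swapping them, so the vector is a permutation of the original whenever the predicate runs. The helper name erase_repeated_partition is invented for this example.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// Hypothetical helper: keep only the values that occur exactly once.
// ASSUMPTION: std::partition moves elements by swapping, so counting inside
// vec from the predicate always sees every original element. Common
// implementations behave this way, but treat it as an assumption.
void erase_repeated_partition(std::vector<i_pair>& vec)
{
    auto occurs_once = [&vec](const i_pair& p)
    {
        return std::count(vec.begin(), vec.end(), p) == 1;
    };

    // Elements satisfying the predicate are placed first; the returned
    // iterator marks the start of the repeated values to erase.
    auto first_repeated = std::partition(vec.begin(), vec.end(), occurs_once);
    vec.erase(first_repeated, vec.end());
}
```

Note that std::partition does not preserve the relative order of the kept elements; std::stable_partition does, at some extra cost.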

An even safer approach would be to make a std::unordered_map<std::pair<int,int>, std::size_t> and count in one pass, then in a second pass remove everything whose count is at least 2. This is also O(n) on average instead of your algorithm's O(n^2), so it should be faster.

std::unordered_map<i_pair,std::size_t, pair_hasher> counts;
counts.reserve(vec.size()); // no more than this
for (auto&& elem:vec) {
  ++counts[elem];
}
vec.erase(std::remove_if(begin(vec), end(vec), [&](auto&&elem){return counts[elem]>1;}), end(vec));

You have to write your own pair_hasher. If you are willing to accept O(n log n) performance, you could do:

std::map<i_pair,std::size_t> counts;
for (auto&& elem:vec) {
  ++counts[elem];
}
vec.erase(std::remove_if(begin(vec), end(vec), [&](auto&&elem){return counts[elem]>1;}), end(vec));
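The answer leaves pair_hasher unspecified, so the following is only one possible sketch of it, together with the unordered_map version wrapped in a function. The mixing step follows the widely used hash_combine idiom; the constant and shifts are an assumption of this example, as is the helper name erase_repeated_hashed.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// One possible pair_hasher (an assumption; the answer does not define it).
struct pair_hasher
{
    std::size_t operator()(const i_pair& p) const noexcept
    {
        const std::size_t h1 = std::hash<int>{}(p.first);
        const std::size_t h2 = std::hash<int>{}(p.second);
        // hash_combine-style mixing, so that swapping the two values
        // usually changes the resulting hash.
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

// Hypothetical helper: count every value in one pass, then erase the
// values whose count is greater than 1 in a second pass.
void erase_repeated_hashed(std::vector<i_pair>& vec)
{
    std::unordered_map<i_pair, std::size_t, pair_hasher> counts;
    counts.reserve(vec.size()); // no more distinct keys than elements
    for (const auto& elem : vec)
    {
        ++counts[elem];
    }

    // The predicate reads only counts, never vec, so it is unaffected by
    // std::remove_if moving elements around.
    vec.erase(std::remove_if(vec.begin(), vec.end(),
                             [&counts](const i_pair& elem)
                             { return counts[elem] > 1; }),
              vec.end());
}
```

Swapping std::unordered_map for std::map removes the need for the hasher, since std::pair already provides operator<.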
