
Unexpected behavior using `std::count` on `std::vector` of pairs

My goal is to completely remove all elements in a std::vector<std::pair<int, int>> that occur more than once.

The idea was to use std::remove_if with std::count as part of the predicate. My approach looks something like this:

#include <iostream>
#include <vector>
#include <algorithm>

using std::cout;
using std::endl;
using i_pair = std::pair<int, int>;

int main()
{
    std::vector<i_pair> vec;
    vec.push_back(i_pair(0,0)); // Expected to stay
    vec.push_back(i_pair(0,1)); // Expected to go
    vec.push_back(i_pair(1,1)); // Expected to stay
    vec.push_back(i_pair(0,1)); // Expected to go

    auto predicate = [&](i_pair& p)
    {
        return std::count(vec.begin(), vec.end(), p) > 1;
    };
    auto it = std::remove_if(vec.begin(), vec.end(), predicate);

    cout << "Reordered vector:" << endl;
    for(auto& e : vec)
    {
        cout << e.first << " " << e.second << endl;
    }
    cout << endl;
    
    cout << "Number of elements that would be erased: " << (vec.end() - it) << endl;

    return 0;
}

The vector gets reordered with both of the (0,1) elements pushed to the end. However, the iterator returned by std::remove_if points at the last element, which means that a subsequent erase operation would only get rid of one (0,1) element.

Why is this behavior occurring and how can I delete all elements that occur more than once?

Your biggest problem is that std::remove_if gives very few guarantees about the contents of the vector while it is running.

It only guarantees that, once it returns, the range from begin() up to the returned iterator contains the elements that were not removed, and the range from there to end() contains elements with valid but unspecified values.

Meanwhile, your predicate iterates over the whole container with std::count in the middle of this operation, so it can see elements that have already been overwritten.
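To make the hazard concrete: a typical std::remove_if implementation shifts kept elements forward as it goes. In the question's example, (1,1) is moved over the first (0,1) before the second (0,1) is tested, so by then std::count only finds one (0,1) and the element is kept. A minimal sketch of one way around this, assuming an extra copy of the vector is acceptable, is to count against an unmodified snapshot. The helper name erase_repeated is hypothetical, and this is still O(n^2).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// Hypothetical helper: erases every value that occurs more than once.
// Counting is done against a copy, so the predicate never reads elements
// that std::remove_if is in the middle of moving.
void erase_repeated(std::vector<i_pair>& vec)
{
    const std::vector<i_pair> snapshot = vec; // unmodified copy to count in

    auto occurs_more_than_once = [&snapshot](const i_pair& p)
    {
        return std::count(snapshot.begin(), snapshot.end(), p) > 1;
    };

    // Erase-remove idiom: remove_if compacts the kept elements to the front,
    // erase drops the leftover tail.
    vec.erase(std::remove_if(vec.begin(), vec.end(), occurs_more_than_once),
              vec.end());
}
```

Calling erase_repeated(vec) on the vector from the question should leave only (0,0) and (1,1), in their original order.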

It is more likely that std::partition would work, as it guarantees (when done) that the elements you are "removing" are actually stored at the end.

An even safer option would be to make a std::unordered_map<std::pair<int,int>, std::size_t> and count in one pass, then in a second pass remove everything whose count is at least 2. This is O(n) on average instead of your algorithm's O(n^2), so it should be faster.

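// pair_hasher is a user-supplied hash functor for i_pair
std::unordered_map<i_pair, std::size_t, pair_hasher> counts;
counts.reserve(vec.size()); // no more than this
for (auto&& elem : vec) {
  ++counts[elem]; // first pass: count occurrences of each value
}
// second pass: erase every element whose value occurs more than once
vec.erase(std::remove_if(begin(vec), end(vec),
                         [&](auto&& elem) { return counts[elem] > 1; }),
          end(vec));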

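You have to write your own pair_hasher, because the standard library does not provide a std::hash specialization for std::pair. If you are willing to accept O(n log n) performance, you could use std::map instead: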

std::map<i_pair, std::size_t> counts;
for (auto&& elem : vec) {
  ++counts[elem];
}
vec.erase(std::remove_if(begin(vec), end(vec),
                         [&](auto&& elem) { return counts[elem] > 1; }),
          end(vec));
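The unordered_map version needs a pair_hasher, which the answer leaves to the reader. Here is a minimal sketch of one possible hasher, wrapped together with the two-pass count-then-remove approach. The packing scheme (both ints squeezed into one 64-bit value) and the helper name erase_repeated_hashed are assumptions made for illustration. Any reasonable hash combiner would do.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// One possible pair_hasher (illustrative, not part of the standard library):
// pack both ints into a single 64-bit value and hash that.
struct pair_hasher
{
    std::size_t operator()(const i_pair& p) const noexcept
    {
        const auto hi = static_cast<std::uint64_t>(static_cast<std::uint32_t>(p.first));
        const auto lo = static_cast<std::uint64_t>(static_cast<std::uint32_t>(p.second));
        return std::hash<std::uint64_t>{}((hi << 32) | lo);
    }
};

// Hypothetical helper: erases every value that occurs more than once.
void erase_repeated_hashed(std::vector<i_pair>& vec)
{
    // First pass: count how often each value occurs.
    std::unordered_map<i_pair, std::size_t, pair_hasher> counts;
    counts.reserve(vec.size());
    for (const auto& elem : vec)
    {
        ++counts[elem];
    }

    // Second pass: the predicate only reads the map, never the vector,
    // so it is unaffected by remove_if moving elements around.
    vec.erase(std::remove_if(vec.begin(), vec.end(),
                             [&counts](const i_pair& elem) { return counts[elem] > 1; }),
              vec.end());
}
```

Because the counts live in a separate container, this avoids the problem from the question: the predicate's answer for a given value does not change while std::remove_if rearranges the vector.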
