My goal is to completely remove all elements in a std::vector<std::pair<int, int>> that occur more than once.
The idea was to use std::remove_if with std::count as part of the predicate. My approach looks something like this:
#include <iostream>
#include <vector>
#include <algorithm>
#include <utility> // std::pair

using std::cout;
using std::endl;
using i_pair = std::pair<int, int>;

int main()
{
    std::vector<i_pair> vec;
    vec.push_back(i_pair(0,0)); // Expected to stay
    vec.push_back(i_pair(0,1)); // Expected to go
    vec.push_back(i_pair(1,1)); // Expected to stay
    vec.push_back(i_pair(0,1)); // Expected to go

    // Counts occurrences in the same vector that remove_if is rearranging
    auto predicate = [&](i_pair& p)
    {
        return std::count(vec.begin(), vec.end(), p) > 1;
    };
    auto it = std::remove_if(vec.begin(), vec.end(), predicate);

    cout << "Reordered vector:" << endl;
    for(auto& e : vec)
    {
        cout << e.first << " " << e.second << endl;
    }
    cout << endl;
    cout << "Number of elements that would be erased: " << (vec.end() - it) << endl;
    return 0;
}
The vector gets reordered with both of the (0,1) elements pushed to the end; however, the iterator returned by std::remove_if points at the last element. This means that a subsequent erase operation would only get rid of one (0,1) element.
Why is this behavior occurring and how can I delete all elements that occur more than once?
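Your biggest problem is that std::remove_if gives very few guarantees about the contents of the vector while it is running.
It only guarantees that, when it is done, the range from begin() to the returned iterator contains the elements that were not removed, and that the range from there to end() contains elements with unspecified values.
Meanwhile, your predicate iterates over the container in the middle of this operation. In your example, by the time the predicate is called on the last element, the first (0,1) has most likely already been overwritten by (1,1), so the count is 1 and that element is kept.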
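One way to sidestep the problem while keeping the original idea is to count against an untouched copy of the vector rather than the vector that is being rearranged. The following is only a minimal sketch: the helper name remove_repeated_snapshot is made up for illustration, and the approach is still O(n^2).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// Hypothetical helper: erases every value that occurs more than once.
// The predicate counts in `snapshot`, which std::remove_if never touches,
// so the counts cannot change while elements are being shifted around.
void remove_repeated_snapshot(std::vector<i_pair>& vec)
{
    const std::vector<i_pair> snapshot = vec; // unmodified copy of the input
    auto repeated = [&snapshot](const i_pair& p)
    {
        return std::count(snapshot.begin(), snapshot.end(), p) > 1;
    };
    vec.erase(std::remove_if(vec.begin(), vec.end(), repeated), vec.end());
}
```

Applied to the vector from the question, this leaves only (0,0) and (1,1), in their original order.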
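std::partition is more likely to work, as it guarantees (when done) that the elements you are "removing" are actually stored at the end, provided the predicate selects the elements to keep, since std::partition moves the elements for which the predicate is true to the front.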
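A rough sketch of the std::partition variant might look like the following. The helper name remove_repeated_partition is hypothetical. The sketch also relies on an assumption rather than a documented guarantee: std::partition rearranges the range by swapping elements, so the vector is always a permutation of the original whenever the predicate runs.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// Hypothetical helper: keeps only the values that occur exactly once.
// Assumption: std::partition rearranges by swapping, so every value is still
// present somewhere in `vec` whenever the predicate is called.
void remove_repeated_partition(std::vector<i_pair>& vec)
{
    auto unique_value = [&vec](const i_pair& p)
    {
        return std::count(vec.begin(), vec.end(), p) == 1;
    };
    // Elements for which the predicate is true end up in front;
    // the repeated ones are gathered behind the returned iterator.
    auto first_repeated = std::partition(vec.begin(), vec.end(), unique_value);
    vec.erase(first_repeated, vec.end());
}
```

Note that std::partition does not promise to preserve the relative order of the elements that are kept, and this version is still O(n^2).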
An even safer option is to build a std::unordered_map<std::pair<int,int>, std::size_t>, count the elements in one pass, and then in a second pass remove everything whose count is at least 2. This is also O(n) on average instead of your algorithm's O(n^2), so it should be faster.
std::unordered_map<i_pair, std::size_t, pair_hasher> counts;
counts.reserve(vec.size()); // no more than this
for (auto&& elem : vec) {
    ++counts[elem];
}
vec.erase(
    std::remove_if(begin(vec), end(vec), [&](auto&& elem){ return counts[elem] > 1; }),
    end(vec));
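You have to write your own pair_hasher, because the standard library does not provide a std::hash specialization for std::pair. If you are willing to accept O(n log n) performance, you could use a std::map instead: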
std::map<i_pair, std::size_t> counts;
for (auto&& elem : vec) {
    ++counts[elem];
}
vec.erase(
    std::remove_if(begin(vec), end(vec), [&](auto&& elem){ return counts[elem] > 1; }),
    end(vec));
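For completeness, here is one way the hash-map version could be put together, including a pair_hasher. The hasher shown is just one possible choice (it mixes the two member hashes in the style of boost::hash_combine), and the function name remove_repeated is made up for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// One possible pair_hasher (an illustrative assumption, not a standard type):
// hash both members and mix them, in the style of boost::hash_combine.
struct pair_hasher
{
    std::size_t operator()(const i_pair& p) const noexcept
    {
        std::size_t seed = std::hash<int>{}(p.first);
        seed ^= std::hash<int>{}(p.second) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};

// Hypothetical helper: erases every value that occurs more than once.
void remove_repeated(std::vector<i_pair>& vec)
{
    // First pass: count how often each value occurs.
    std::unordered_map<i_pair, std::size_t, pair_hasher> counts;
    counts.reserve(vec.size());
    for (const auto& elem : vec) {
        ++counts[elem];
    }
    // Second pass: the predicate only reads `counts`, never `vec`,
    // so it is unaffected by how remove_if shuffles the elements.
    vec.erase(std::remove_if(vec.begin(), vec.end(),
                             [&counts](const i_pair& elem) { return counts.at(elem) > 1; }),
              vec.end());
}
```

Because the predicate never looks at the vector itself, the result does not depend on what std::remove_if does to the elements while it runs. The surviving elements keep their original relative order.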