简体   繁体   中英

Erasing elements in std::vector by using indexes

I've a std::vector<int> and I need to remove all elements at given indexes (the vector usually has high dimensionality). I would like to know, which is the most efficient way to do such an operation having in mind that the order of the original vector should be preserved.

Although, I found related posts on this issue, some of them needed to remove one single element or multiple elements where the remove-erase idiom seemed to be a good solution. In my case, however, I need to delete multiple elements and since I'm using indexes instead of direct values, the remove-erase idiom can't be applied, right? My code is given below and I would like to know if it's possible to do better than that in terms of efficiency?

bool find_element(const vector<int> & vMyVect, int nElem){
    return (std::find(vMyVect.begin(), vMyVect.end(), nElem)!=vMyVect.end()) ? true : false;
}

void remove_elements(){

    srand ( time(NULL) );

    int nSize = 20;
    std::vector<int> vMyValues;
    for(int i = 0; i < nSize; ++i){
            vMyValues.push_back(i);
    }

    int nRandIdx;
    std::vector<int> vMyIndexes;
    for(int i = 0; i < 6; ++i){
        nRandIdx = rand() % nSize;
        vMyIndexes.push_back(nRandIdx);
    }

    std::vector<int> vMyResult;
    for(int i=0; i < (int)vMyValues.size(); i++){
        if(!find_element(vMyIndexes,i)){
            vMyResult.push_back(vMyValues[i]);
        }
    }
}

I think it could be more efficient, if you just just sort your indices and then delete those elements from your vector from the highest to the lowest. Deleting the highest index on a list will not invalidate the lower indices you want to delete, because only the elements higher than the deleted ones change their index.

If it is really more efficient will depend on how fast the sorting is. One more pro about this solultion is, that you don't need a copy of your value vector, you can work directly on the original vector. code should look something like this:

... fill up the vectors ...

sort (vMyIndexes.begin(), vMyIndexes.end());

for(int i=vMyIndexes.size() - 1; i >= 0; i--){
    vMyValues.erase(vMyValues.begin() + vMyIndexes[i])
}

Erase-remove multiple elements at given indices

Update: after the feedback on performance from @kory, I've modified the algorithm not to use flagging and move/copy elements in chunks (not one-by-one).

Notes:
  • indices need to be sorted and unique
  • uses std::move (replace with std::copy for c++98):

Github Live example

Code:
std::vector<int> v = /*...*/; // vector to remove elements from
std::vector<int> ii = /*...*/; // indices of elements to be removed

// prepare indices
std::sort(ii.begin(), ii.end());
ii.erase(std::unique(ii.begin(), ii.end()), ii.end());

// remove elements at indices
v.erase(remove_at(v.begin(), v.end(), ii.begin(), ii.end()), v.end());
Usage example:
 std::vector<int> v = /*...*/; // vector to remove elements from std::vector<int> ii = /*...*/; // indices of elements to be removed // prepare indices std::sort(ii.begin(), ii.end()); ii.erase(std::unique(ii.begin(), ii.end()), ii.end()); // remove elements at indices v.erase(remove_at(v.begin(), v.end(), ii.begin(), ii.end()), v.end());

to avoid moving the same elements many times, we can move them by ranges between deleted indexes

// fill vMyIndexes, take care about duplicated values
vMyIndexes.push_back(-1); // to handle range from 0 to the first index to remove
vMyIndexes.push_back(vMyValues.size()); // to handle range from the last index to remove and to the end of values
std::sort(vMyIndexes.begin(), vMyIndexes.end());
std::vector<int>::iterator last = vMyValues.begin();
for (size_t i = 1; i != vMyIndexes.size(); ++i) {
    size_t range_begin = vMyIndexes[i - 1] + 1;
    size_t range_end = vMyIndexes[i];
    std::copy(vMyValues.begin() + range_begin, vMyValues.begin() + range_end,   last);
    last += range_end - range_begin;
}
vMyValues.erase(last, vMyValues.end());

PS fixed a bug, thanks to Steve Jessop that patiently tried to show me it

What you can do is split the vector (actually any non-associative container) in two groups, one corresponding to the indices to be erased and one containing the rest.

template<typename Cont, typename It>
auto ToggleIndices(Cont &cont, It beg, It end) -> decltype(std::end(cont))
{
    int helpIndx(0);
    return std::stable_partition(std::begin(cont), std::end(cont), 
        [&](typename Cont::value_type const& val) -> bool {
            return std::find(beg, end, helpIndx++) != end;
    });
}

you can then delete from (or up to) the split point to erase (keep only) the elements corresponding to the indices

std::vector<int> v;
v.push_back(0);
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);
v.push_back(5);

int ar[] = { 2, 0, 4 };
v.erase(ToggleIndices(v, std::begin(ar), std::end(ar)), v.end());
  • If the 'keep only by index' operation is not needed you can use remove_if insted of stable_partition (O(n) vs O(nlogn) complexity)
  • To work for C arrays as containers the lambda function should be [&](decltype(*(std::begin(cont))) const& val) -> bool { return std::find(beg, end, helpIndx++);= end. } but then the .erase() method is no longer an option

If you want to ensure that every element is only moved once, you can simply iterate through each element, copy those that are to remain into a new, second container, do not copy the ones you wish to remove, and then delete the old container and replace it with the new one:)

This is an algorithm based on Andriy Tylychko 's answer so that this can make it easier and faster to use the answer, without having to pick it apart. It also removes the need to have -1 at the beginning of the indices list and a number of items at the end. Also some debugging code to make sure the indices are valid (sorted and valid index into items ).

template <typename Items_it, typename Indices_it>
auto remove_indices(
    Items_it items_begin, Items_it items_end
  , Indices_it indices_begin, Indices_it indices_end
)
{
    static_assert(
      std::is_same_v<std::random_access_iterator_tag
        , typename std::iterator_traits<Items_it>::iterator_category>
      , "Can't remove items this way unless Items_it is a random access iterator");

    size_t indices_size = std::distance(indices_begin, indices_end);
    size_t items_size = std::distance(items_begin, items_end);
    if (indices_size == 0) {
        // Nothing to erase
        return items_end;
    }

    // Debug check to see if the indices are already sorted and are less than
    // size of items.
    assert(indices_begin[0] < items_size);
    assert(std::is_sorted(indices_begin, indices_end));

    auto last = items_begin;
    auto shift = [&last, &items_begin](size_t range_begin, size_t range_end) {
        std::copy(items_begin + range_begin, items_begin + range_end, last);
        last += range_end - range_begin;
    };

    size_t last_index = -1;
    for (size_t i = 0; i != indices_size; ++i) {
        shift(last_index + 1, indices_begin[i]);
        last_index = indices_begin[i];
    }
    shift(last_index + 1, items_size);
    return last;
}

Here is an example of usage:

template <typename T>
std::ostream& operator<<(std::ostream& os, std::vector<T>& v)
{
    for (auto i : v) {
        os << i << " ";
    }
    os << std::endl;
    return os;
}

int main()
{
    using std::begin;
    using std::end;
    std::vector<int> items = { 1, 3, 6, 8, 13, 17 };
    std::vector<int> indices = { 0, 1, 2, 3, 4 };

    std::cout << items;
    items.erase(
          remove_indices(begin(items), end(items), begin(indices), end(indices))
        , std::end(items)
    );
    std::cout << items;

    return 0;
}

Output:

1 3 6 8 13 17 
17 

The headers required are:

#include <iterator>
#include <vector>
#include <iostream> // only needed for output
#include <cassert>
#include <type_traits>

And a Demo can be found on godbolt.org.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM