简体   繁体   中英

vector::erase and reverse_iterator

I have a collection of elements in a std::vector that are sorted in a descending order starting from the first element. I have to use a vector because I need to have the elements in a contiguous chunk of memory. And I have a collection holding many instances of vectors with the described characteristics (always sorted in a descending order).

Now, sometimes, when I find out that I have too many elements in the greater collection (the one that holds these vectors), I discard the smallest elements from these vectors some way similar to this pseudo-code:

grand_collection: collection that holds these vectors
T: type argument of my vector
C: the type that is a member of T, that participates in the < comparison (this is what sorts data before they hit any of the vectors).

std::map<C, std::pair<T::const_reverse_iterator, std::vector<T>&>> what_to_delete;
iterate(it = grand_collection.begin() -> grand_collection.end())
{
     iterate(vect_rit = it->rbegin() -> it->rend())
     {
         // ...
          what_to_delete <- (vect_rit->C, pair(vect_rit, *it))
          if (what_to_delete.size() > threshold)
               what_to_delete.erase(what_to_delete.begin());
         // ...  
     }
}

Now, after running this code, in what_to_delete I have a collection of iterators pointing to the original vectors that I want to remove from these vectors (overall smallest values). Remember, the original vectors are sorted before they hit this code, which means that for any what_to_delete[0 - n] there is no way that an iterator on position n - m would point to an element further from the beginning of the same vector than n , where m > 0 .

When erasing elements from the original vectors, I have to convert a reverse_iterator to iterator. To do this, I rely on C++11's §24.4.1/1:

The relationship between reverse_iterator and iterator is &*(reverse_iterator(i)) == &*(i- 1)

Which means that to delete a vect_rit , I use:

vector.erase(--vect_rit.base());

Now, according to C++11 standard §23.3.6.5/3 :

iterator erase(const_iterator position); Effects: Invalidates iterators and references at or after the point of the erase.

How does this work with reverse_iterators? Are reverse_iterators internally implemented with a reference to a vector's real beginning ( vector[0] ) and transforming that vect_rit to a classic iterator so then erasing would be safe? Or does reverse_iterator use rbegin() (which is vector[vector.size()] ) as a reference point and deleting anything that is further from vector's 0-index would still invalidate my reverse iterator?

Edit:

Looks like reverse_iterator uses rbegin() as its reference point. Erasing elements the way I described was giving me errors about a non-deferenceable iterator after the first element was deleted. Whereas when storing classic iterators (converting to const_iterator ) while inserting to what_to_delete worked correctly.

Now, for future reference, does The Standard specify what should be treated as a reference point in case of a random-access reverse_iterator? Or this is an implementation detail?

Thanks!

From a standardese point of view (and I'll admit, I'm not an expert on the standard): From §24.5.1.1:

namespace std {
    template <class Iterator>
    class reverse_iterator ...
    {
        ...
            Iterator base() const; // explicit
        ...
        protected:
            Iterator current;
        ...
    };
}

And from §24.5.1.3.3:

Iterator base() const; // explicit
    Returns: current.

Thus it seems to me that so long as you don't erase anything in the vector before what one of your reverse_iterator s points to, said reverse_iterator should remain valid.

Of course, given your description, there is one catch: if you have two contiguous elements in your vector that you end up wanting to delete, the fact that you vector.erase(--vector_rit.base()) means that you've invalidated the reverse_iterator "pointing" to the immediately preceeding element, and so your next vector.erase(...) is undefined behavior.

Just in case that's clear as mud, let me say that differently:

std::vector<T> v=...;
...
// it_1 and it_2 are contiguous
std::vector<T>::reverse_iterator it_1=v.rend();
std::vector<T>::reverse_iterator it_2=it_1;
--it_2;

// Erase everything after it_1's pointee:

// convert from reverse_iterator to iterator
std::vector<T>::iterator tmp_it=it_1.base();

// but that points one too far in, so decrement;
--tmp_it;

// of course, now tmp_it points at it_2's base:
assert(tmp_it == it_2.base());

// perform erasure
v.erase(tmp_it);  // invalidates all iterators pointing at or past *tmp_it
                  // (like, say it_2.base()...)

// now delete it_2's pointee:
std::vector<T>::iterator tmp_it_2=it_2.base(); // note, invalid iterator!

// undefined behavior:
--tmp_it_2;
v.erase(tmp_it_2);

In practice, I suspect that you'll run into two possible implementations: more commonly, the underlying iterator will be little more than a (suitably wrapped) raw pointer, and so everything will work perfectly happily. Less commonly, the iterator might actually try to track invalidations/perform bounds checking (didn't Dinkumware STL do such things when compiled in debug mode at one point?), and just might yell at you.

In the question you have already quoted exactly what the standard says a reverse_iterator is:

The relationship between reverse_iterator and iterator is &*(reverse_iterator(i)) == &*(i- 1)

Remember that a reverse_iterator is just an 'adaptor' on top of the underlying iterator ( reverse_iterator::current ). The 'reference point', as you put it, for a reverse_iterator is that wrapped iterator, current . All operations on the reverse_iterator really occur on that underlying iterator. You can obtain that iterator using the reverse_iterator::base() function.

If you erase --vect_rit.base() , you are in effect erasing --current , so current will be invalidated.

As a side note, the expression --vect_rit.base() might not always compile. If the iterator is actually just a raw pointer (as might be the case for a vector ), then vect_rit.base() returns an rvalue (a prvalue in C++11 terms), so the pre-decrement operator won't work on it since that operator needs a modifiable lvalue. See "Item 28: Understand how to use a reverse_iterator 's base iterator " in "Effective STL" by Scott Meyers. (an early version of the item can be found online in "Guideline 3" of http://www.drdobbs.com/three-guidelines-for-effective-iterator/184401406 ).

You can use the even uglier expression, (++vect_rit).base() , to avoid that problem. Or since you're dealing with a vector and random access iterators: vect_rit.base() - 1

Either way, vect_rit is invalidated by the erase because vect_rit.current is invalidated.

However, remember that vector::erase() returns a valid iterator to the new location of the element that followed the one that was just erased. You can use that to 're-synchronize' vect_rit :

vect_rit = vector_type::reverse_iterator( vector.erase(vect_rit.base() - 1));

The reverse_iterator , just like the normal iterator , points to a certain position in the vector. Implementation details are irrelevant, but if you must know, they both are (in a typical implementation) just plain old pointers inside. The difference is the direction. The reverse iterator has its + and - reversed wrt the regular iterator (and also ++ and -- , > and < etc).

This is interesting to know, but doesn't really imply an answer to the main question.

If you read the language carefully, it says:

Invalidates iterators and references at or after the point of the erase.

References do not have a built-in sense of direction. Hence, the language clearly refers to the container's own sense of direction. Positions after the point of the erase are those with higher indices. Hence, the iterator's direction is irrelevant here.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM