简体   繁体   中英

Decrementing std::vector::iterator before std::vector::begin()

Following one of the "deleting while iterating" patterns on a vector, I don't understand why this code works, or if it's making use of undefined behavior:

The Code :

#include <vector>
#include <iostream>

int main(int argc, char* argv[], char* envz[])
{
  std::vector<std::string> myVec;
  myVec.push_back("1");
  myVec.push_back("2");
  myVec.push_back("3");

  for (std::vector<std::string>::iterator i = myVec.begin();
       i != myVec.end();
       ++i)
  {
    if ("1" == *i)
    {
      std::cout << "Erasing " << *i << std::endl;
      i = myVec.erase(i);
      --i;
      continue;
    }
    std::cout << *i << std::endl;
  }

  return 0;
}

The Output :

>g++ -g main.cpp
>./a.out 
Erasing 1
2
3

Question :

Consider the first iteration of the for-loop:

  • i is myVec.begin(), which "points to" 1 .
  • We enter the conditional block.
  • 1 is erased and i is set to one past the erased element, ie 2 , which is now also pointed to by myVec.begin()
  • I decrement i , so now it points to...one prior to myVec.begin() ???

I'm confused by why this seems to work, as evidenced by the output, but something feels fishy about decrementing the iterator. This code is easy enough to rationalize if the conditional is if ("2" == *i) , because the iterator decrement still places it at a valid entry in the vector. Ie if we conditionally erased 2 , i would be set to point to 3 , but then manually decremented and thus point to 1 , followed by the for-loop increment, setting it to point back to 3 again. Conditionally erasing the last element is likewise easy to follow.

What Else I Tried :

This observation made me hypothesize that decrementing prior to vector::begin() was idempotent, so I tried addition an additional decrement, like so:

#include <vector>
#include <iostream>

int main(int argc, char* argv[], char* envz[])
{
  std::vector<std::string> myVec;
  myVec.push_back("1");
  myVec.push_back("2");
  myVec.push_back("3");

  for (std::vector<std::string>::iterator i = myVec.begin();
       i != myVec.end();
       ++i)
  {
    if ("1" == *i)
    {
      std::cout << "Erasing " << *i << std::endl;
      i = myVec.erase(i);
      --i;
      --i;      /*** I thought this would be idempotent ***/
      continue;
    }
    std::cout << *i << std::endl;
  }

  return 0;
}

But this resulted in a segfault:

Erasing 1
Segmentation fault (core dumped)

Can someone explain why the first code bock works, and specifically why the single decrement after erasing the first element is valid?

No, your code has undefined behaviour: if i == myVec.begin() , then i = myVec.erase(i); results in i again being (the new value of) myVec.begin() , and --i has undefined behaviour since it goes outside the valid range for the iterator.

If you don't want to use the erase-remove idiom (ie myVec.erase(std::remove(myVec.begin(), myVec.end(), "1"), myVec.end()) ), then the manual loop-while-mutating looks like this:

for (auto it = myVec.begin(); it != myVec.end(); /* no increment! */) {
  if (*it == "1") {
    it = myVec.erase(it);
  } else {
    ++it;
  }
}

Regardless, the crucial point both here and in your original code is that erase invalidates iterators, and thus the iterator must be re-assigned with a valid value after the erasing. We achieve this thanks to the return value of erase , which is precisely that new, valid iterator that we need.

This might work in some compilers, but might fail in others (eg the compiler might actually check in runtime that you are not decrementing under begin() and throw exception in such case - I believe that at least one compiler does it but don't remember which one).

In this case the general pattern is to not increment in the for but inside the loop:

  for (std::vector<std::string>::iterator i = myVec.begin();
       i != myVec.end();
       /* no increment here */)
  {
    if ("1" == *i)
    {
      std::cout << "Erasing " << *i << std::endl;
      i = myVec.erase(i);
      continue;
    }
    std::cout << *i << std::endl;
    ++i;
  }

With vector the wrong iteration might actually work in more cases, but you'd have very bad time if you try that eg with std::map or std::set .

The key here is the continue right after decrementing. By calling it, ++i will be triggered by the loop iteration before dereferencing i .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM