简体   繁体   中英

What happens to a pointer that points to an element in a vector when I shuffle it?

I have a std::vector<int> and a pointer int* that points to an element in the vector. Let's say the pointer points to the third element: pointer=&vector.at(2) . If I now shuffle the vector, will it still point to the same element (the third) or will it point the the new location where the element which used to be the third has now moved?

After that, I'd like to make the question a little bit more general: How do pointers and iterators to elements in a vector behave when the vector is expanded or reduced?

The pointer will continue to point to the same location , so when you shuffle, it'll point to whatever element has been moved into the location you specified.

When you expand the size of a vector, all existing pointers and iterators into the vector can become invalid. When you shuffle, they continue to refer to the same location, which will (usually) contain a different value than it did before shuffling.

Reducing the size of a vector will depend on exactly how you do that. One way is to create a temporary vector as a copy of the current vector, swap the two, then destroy the temporary (usually implicitly, by letting it go out of scope). If you do this, the pointers will be into the temporary, and be invalidated when it's destroyed.

If you use shrink_to_fit that (probably) won't invalidate iterators/pointers, but may not have any effect (the standard specifies that it's a non-binding request, and doesn't say anything about it invalidating iterators/pointers).

If the vector is shuffled without being resized then the pointer still points to the same location , which will probably contain a different element.

If the vector is resized to be larger, then the pointer is said to be "invalidated" and it has the same status as an uninitialized pointer, ie evaluating it or trying to read through it causes undefined behaviour.

If I now shuffle the vector, will it still point to the same element (the third) or will it point the the new location where the used-to-third element has moved?

Shuffling elements is just a matter of copying/swapping elements through the various "buckets" in the array, while your pointer just points to "that fixed position in memory". So, it'll keep pointing to whatever stays in third position in the array.

Then I like to make the question a little bit more general: How do pointer and iterators to elements in a vector behave when the vector is expanded, reduced or shuffled?

Expand: all iterators/references/pointers may be invalidated.

Reduced: as far as they point to elements before those removed, they are kept valid unless you do a shrink_to_fit . Iterators/pointers to elements you removed are obviously invalid.

Shuffled: you are moving around stuff without causing reallocations, so iterators and references are still valid.

Notice that all this stuff is typically reported in most C++ documentation sources.


The conceptual rule to remember for vectors is that they are just a box around a dynamic array, and iterators and pointers to elements are conceptually the same thing (actually, std::vector<T>::iterator could be a typedef for T * ). The same holds for references (which are pointers in disguise).

If an operation may need reallocate the array (=the array needs to grow, or you explicitly requested it to shrink), then all iterators/pointers/references are going to be invalidated. If you remove elements, then pointers pointing past the "conceptual end" of the vector will point to invalid elements. If the size stays the same, no reallocation needs to occur.

The address will not change, but the value stored at that address will.

#include <iostream>
#include <algorithm>

static void print_vec(const std::vector<int>& v) {
    for (int i = 0; i < v.size();  ++i) {
        std::cout << "Value: " << v.at(i) << " Address: " << &v.at(i) << std::endl;
    }
}

static void shuffle_vec(std::vector<int>& v) {
    auto engine = std::default_random_engine{};
    std::shuffle(v.begin(), v.end(), engine);
}

int main() {
    std::vector<int> v{1, 2, 3, 4, 5};
    std::cout << "Before Shuffle: " << std::endl;
    print_vec(v);
    shuffle_vec(v);
    std::cout << "After Shuffle: " << std::endl;
    print_vec(v);

    return 0;
}

Output:

Before Shuffle: 
Value: 1 Address: 0x16eb320
Value: 2 Address: 0x16eb324
Value: 3 Address: 0x16eb328
Value: 4 Address: 0x16eb32c
Value: 5 Address: 0x16eb330
After Shuffle: 
Value: 3 Address: 0x16eb320
Value: 1 Address: 0x16eb324
Value: 5 Address: 0x16eb328
Value: 4 Address: 0x16eb32c
Value: 2 Address: 0x16eb330

In practice, a vector is a code-maintained contiguous buffer of data. Each element is set up adjacent to the next in an array-like fashion.

When elements move around, in practice, they just move around. Pointers point to locations in that buffer, so if the element moves, in practice the pointer just ends up pointing somewhere else.

However, the C++ standard is more strict. When an iterator is invalidated, so are references and pointers to that location. There are a number of operations that can invalidate an iterator that do not, logically, change the fact that the array is actually going to be the same buffer. For example, if you .erase an element, it invalidates every iterator at that location and afterwards.

In practice a pointer to the element will end up pointing at what was the "next" element in the list, but the standard doesn't guarantee that.

std::shuffle does not invalidate any iterators. It just changes the values stored there. So a pointer to the nth element will both in practice, and in theory, still point to the nth element.

When the vector is expanded, if you expand it beyond .capacity() , all iterators are invalidated. In practice it actually moves the data to a new location, and the pointers are now danging pointers.

When you reduce (via .erase(it) or .erase(a,b) ), all iterators at or after the first argument are invalidated. This means that references and pointers to these elements are also invalidated. In practice, they will now refer to elements "further down the chain" (if such elements exist), as neither .erase will cause your buffer to reallocate, but this is not guaranteed by the standard.

There are other operations that can invalidate iterators. .shrink_to_fit() can, as can vector<X>(vec).swap(vec) (the C++03 version of shrink-to-fit), and .reserve() and operations that grow the size beyond .capacity() .

The operations that cause .capcity() to change will actually make your pointers invalid (or those that make the pointers point beyond-the-end) in practice and in theory.

Read the documentation for every function you call. If you don't know when and how you may call it and what it does then why are you using it?

In general you cannot rely on implementation notions like addresses or arrays and you cannot rely on a test program. You must read when an iterator is or is not invalidated for what elements for a particular container, iterator and operator.

vector::shrink_to_fit invalidates all iterators
vector::resize to same or smaller invalidates exactly the iterators beyond the new size
vector::resize to larger invalidates all iterators

From the C++14 standard [iterator.requirements.general]:

[P]ointers are iterators. The effect of dereferencing an iterator that has been invalidated is undefined.

http://en.cppreference.com/w/cpp/container/vector

std::vector is a sequence container that encapsulates dynamic size arrays.
The elements are stored contiguously, which means that elements can be accessed not only through iterators, but also using offsets on regular pointers to elements.

iterator RandomAccessIterator

Iterator invalidation
swap, std::swap Never
shrink_to_fit Always
resize If the vector changed capacity, all of them. If not, only those after the insertion point.

http://en.cppreference.com/w/cpp/container/vector/resize

Vector capacity is never reduced when resizing to smaller size because that would invalidate all iterators, rather than only the ones that would be invalidated by the equivalent sequence of pop_back() calls.

After vector::shuffle iterators/pointers are unchanged but dereference to new values.

Because shuffle uses swap which leaves iterators unchanged:

http://en.cppreference.com/w/cpp/algorithm/random_shuffle

 template< class RandomIt, class URNG > void shuffle( RandomIt first, RandomIt last, URNG&& g ); 

RandomIt must meet the requirements of ValueSwappable and RandomAccessIterator.

http://en.cppreference.com/w/cpp/concept/Iterator

Iterator is the base concept used by other iterator types: InputIterator, OutputIterator, ForwardIterator, BidirectionalIterator, and RandomAccessIterator. Iterators can be thought of as an abstraction of pointers. [...]

`- lvalues of type It satisfy Swappable [...]

http://en.cppreference.com/w/cpp/concept/ValueSwappable

Type T is ValueSwappable if
1) Type T satisfies the Iterator requirements
2) For any dereferencable object x of type T (that is, any value other than the end iterator), *x satisfies the Swappable requirements.

http://en.cppreference.com/w/cpp/concept/Swappable

 using std::swap; swap(u, t); 

After the call, the value of t is the value held by u before the call, and the value of u is the value held by t before the call.

As people have mentioned, the pointer is "pointing" to a location in memory, regardless of the contents that are there. There is actually some really interesting stuff you can do, like having an array of 5 elements, but printing out the value at position 6, even though it is not within the scope of your array. By accessing an array with something like array[5] when you have only declared it to be 5 elements long, you will end up with undefined behaviour, essentially meaning that a variety of things could happen, with each run potentially returning something completely different. See philipxy's comments below for some very useful links delving into this concept.

So with that out of the way, here is a little bit of code you could test to actually see this effect.

#include <vector>
#include <iostream>

using namespace std;

int main()
{
    vector<int> values (5);

    for (int i = 0; i < 5; i++)
        values[i] = i;

    for (int i = 0; i < 5; i++)
        cout << values[i] << " ";

    //Initialise the pointer so that it is pointing at the first element in vector
    int* pointer = &values[0];

    //By incrementing, we expect it to be pointing at the second element, which should be 1
    pointer++;

    cout << endl << "Pointer " << *pointer << endl;

    //Reverse the order of the vector
    reverse(values.begin(), values.end());

    for (int i = 0; i < 5; i++)
        cout << values[i] << " ";

    cout << endl << "Pointer " << *pointer << endl; 

    return 0;
}

The result of this code is:

代码输出结果

So we can see that the pointer hasn't actually changed where it is pointing, but that cell in memory has been altered, so dereferencing the pointer will yield a different result.

It completely depends on how "std::vector" is implemented. I am not sure if there are any guarantees about that.

EDIT: I just learned, that the C++ standard indeed is a lot stricter than I thought (thanks to philipxy). It does specify that vector must internally behave like a C Array. See

http://herbsutter.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/

So forget about the rest, at least if you have an implementation which conforms to at least C++03.

If you would for example implement std::vector as a linked list (not likely), then shuffling, reducing size, etc will not do anything.

If the "std::vector" internally uses something like "int []" to store its elements (likely), then reshuffling the elements will probably mean that your pointer will now point to a different value then before (what Stevo tried out).
If you resize your vector in this case, then again it depends entirely on internal implementation. A resize might allocate a new "int []" and copy the old content over. In this case your pointer will point to now unallocated memory, so all havoc might break loose.
If you are lucky (depending on the implementation) then a shrinking or growing of the vector by a "small" amount might not do anything (your pointer is still valid).

Summary: Don't do that ;-) (using pointers and afterwards modifying your container).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM