Self-referential pointer arithmetic

So given the following code:

#include <iostream>
#include <vector>

int main(int argc, char* argv[]) {
    int i = 42;
    int* p = &i;

    std::cout << "*p: " << *p << std::endl;
    std::cout << "&p: " << &p << std::endl;
    std::cout << "p: " << p << std::endl;
    std::cout << "p + 1: " << (p + 1) << std::endl;
    std::cout << "p + 1: " << ((p + 1) == (int*)(&p)) << std::endl;
    std::cout << "*(p + 1): " << *(p + 1) << std::endl;
    return 0; 
}

It might produce the following output:

*p: 42
&p: 0x7fff38d8a888
p: 0x7fff38d8a884
p + 1: 0x7fff38d8a888
p + 1: 1
*(p + 1): 953723012

Is (p + 1) a pointer to the memory location where p is stored? Is it possible to get the value pointed to by p this way?

In your example (p + 1) does not point to any storage you have allocated, so dereferencing it produces undefined behavior and should be avoided.

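EDIT: Also, the result of your second (p + 1) line, the comparison, is unreliable. Pointer arithmetic is only meaningful within an array (a single object counts as an array of one element, so p + 1 is merely a one-past-the-end pointer), and whether that pointer compares equal to the address of an unrelated object depends on how the compiler lays out the variables. The expression evaluates to false on my machine.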

p is the pointer to an int object. &p is the address of p.

The stack from your example looks like:

Address        Type       Name        Value
0x7fff38d8a884 int        i           42
0x7fff38d8a888 int*       p           0x7fff38d8a884

Given the way the stack has been set up, p is stored right after i. In this particular case, adding 1 to p moves the address forward by 4 bytes (sizeof(int)), which is exactly where p itself is stored, and the value found there is the address of i.

What is happening in the line

std::cout << "p + 1: " << ((p + 1) == (int*)(&p)) << std::endl;

is

p + 1 --> the compiler computes the address of the "second element" of p, treating p as if it pointed into an array
(int*)(&p) --> &p is an int**, but it is being cast to an int*; in this particular instance, that happens to be the same as the value stored in p plus 4 bytes

What is happening in the line

std::cout << "*(p + 1): " << *(p + 1) << std::endl;

is *(p + 1) --> the compiler accesses the "second element" of p, treating p as if it pointed into an array. Because you are likely using an x86_64 system, which is little-endian, the 4 bytes read there hold 0x38D8A884, the lower half of the pointer stored in p (which is 953723012 in decimal).

If you remember that a pointer can be indexed like an array, you might figure out that, e.g.,

p[1]

is the same as

*(p + 1)

That means that the expression (p + 1) is a pointer to the int-sized slot right after the int that p points to. As p doesn't point into an array, (p + n) for a positive n does not point to anything you have allocated (it's out of bounds), and reading that value leads to undefined behavior. Assigning to it is also undefined behavior, and can even overwrite other variables' data.

To get the address where p is stored, use the address-of operator: &p. That returns a pointer to the pointer (i.e. of type int**).

The standard gives you no guarantee that ((p + 1) == (int*)(&p)); you just seem to be lucky here.

However, since you are on a 64-bit machine, dereferencing (p + 1) reads a 32-bit int, so you get only the lower 32 bits of p.

0x38D8A884 == 953723012

The right-hand side of the equation is the output that you received. The left-hand side is the lower 32 bits of p, as shown by the output of your program.

No.

Pointer arithmetic, although unchecked, is very limited by the Standard. In general, it should only be used within an array, and you may use it to point to either an array element or one past the end of the array. Furthermore, although pointing one past the end of an array is allowed, the so-obtained pointer is a sentinel value which should not be dereferenced.


So, what is it that you observe? Simply put, &p, p + 1, etc. are temporary expressions whose results have to be materialized somewhere. With optimizations on, those results would probably be materialized in CPU registers; without optimizations, they are (in general) materialized on the stack within the function's frame.

Of course, this location is not prescribed by the Standard, so trying to obtain it produces undefined behavior. The fact that it appears to work on your compiler with this set of compilation options means nothing for any other compiler, or even for this very same compiler with any other set of options.

That is the true meaning of undefined behavior: it does not mean the program crashes; it means anything may happen, and that includes the "seems to work" situations.

It is a coincidence that p + 1 is equal to &p. It happens only in code such as yours, where the pointer p is placed immediately after the object it points to, that is, where the address of p itself is sizeof(int) greater than the address of the object it points to. If, for example, you insert one more definition between i and p, the equality p + 1 == &p may no longer hold. For example:

int i = 42;
int j = 62;
int* p = &i;

p just so happened to be allocated on the stack at the address right after (well, 4 bytes after) the address of the integer i. For a pointer some_ptr, some_ptr + 1 (which advances the address by 1 * sizeof(int) bytes) is not a reliable way to get the address of some_ptr itself; it is just a coincidence in this case.

So, to answer your question: some_ptr + 1 is not, in general, equal to &some_ptr.

