
Difference in values between pointer access and array access

Can someone please clarify an incorrect interpretation on my part? I know that my understanding is incorrect because of my code's output (see the bottom of the question). Thanks in advance.

To clarify, what does each segment of the following line mean?:

*(u8 *)((u32)BufferAddress + (u32)i)

And how does it differ from the following line:

*(u32 *)((u32)BufferAddress + (u32)i)

My interpretation of the above is:

  1. segment1 = ((u32)BufferAddress + (u32)i) => determine an address as an integer.
  2. segment2 = (u32 *)(segment1) => cast the address to be treated like a pointer, where the pointer is 32 bits in length.
  3. segment3 = *(segment2) => dereference the pointer in order to obtain the value residing at the calculated address.

What is incorrect about my interpretation? I think my lack of understanding is in the segment2 area... What is the difference between casting (u32 *) and (u8 *)?

Here is the code that made me realize I have a knowledge-gap:

main(...) {
     ...
     u8 *Buffer = malloc(256);
     ...
     Buffer[0] = 1;
     Buffer[1] = 0;
     Buffer[2] = 0;
     Buffer[3] = 4;
     Buffer[4] = 0;
     Buffer[5] = 0;
     qFunction(... , Buffer, 6, ...);
     ...
}

qFunction(... , const u8 *BufferPointer, u32 BufferLength, ...) {
     u32 BufferAddress;
     ...
     BufferAddress = (u32) BufferPointer;
     ...

     /* Method 1: */
     for (i=0; i < BufferLength; i++)
          printf("%d, %p\n", BufferPointer[i], &BufferPointer[i]);

     /* Method 2: */
     for (i=0; i < BufferLength; i++)
          printf("%d, 0x%lx\n", *(u8 *)(BufferAddress+i), BufferAddress+i);

     /* Method 3: */
     for (i=0; i < BufferLength; i++)
          printf("%d, 0x%lx\n", *(u32 *)(BufferAddress+i), BufferAddress+i);
     ...
 }

The outputs of Method 1 and Method 2 are as I expect (both are the same):

1, 0x1000000
0, 0x1000001
0, 0x1000002
4, 0x1000003
0, 0x1000004
0, 0x1000005

However, the output of Method 3 seems weird to me; only part of the result is the same as Method 1/2:

-1442840511, 0x1000000
11141120, 0x1000001
43520, 0x1000002
4, 0x1000003
0, 0x1000004
0, 0x1000005

I'd appreciate any tips or references to reading material. Thanks.

*(u8 *)((u32)BufferAddress + (u32)i)
*(u32 *)((u32)BufferAddress + (u32)i)

The top line casts the integer address to a pointer to an unsigned 8-bit value before dereferencing, while the second casts it to a pointer to an unsigned 32-bit value before dereferencing. The top line reads a single byte and the bottom one reads 4 bytes.

To address your other question:

What is incorrect about my interpretation? I think my lack of understanding is in the segment2 area... What is the difference between casting (u32 *) and (u8 *)?

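The interpretation of the address being 32 bits in length is true for both the top and the bottom line of code. What differs between (u8 *) and (u32 *) is the size of the object read at that address, not the size of the pointer.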

I could nitpick, and say "You didn't give us enough information". Technically this is true, but it just requires making some assumptions. u8 and u32 aren't standard C types, and you could have them typedef'd to anything, but presumably they represent an unsigned 8-bit value (e.g. unsigned char) and an unsigned 32-bit value (e.g. unsigned int). Assuming that, let's look at the ones you understand, and explain where that leaves the third one.

BufferPointer is a const u8 *, which means it's a pointer to constant u8 data: the pointer itself can be changed, but the bytes can't be modified through it. So the array it's pointing to is treated as read-only 8-bit unsigned values.

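Now, BufferAddress is a u32 - that's large enough to hold a pointer on a typical 32-bit system. Pointer size depends on the platform, though: on a 64-bit system pointers are usually 64 bits, and storing one in a u32 would truncate it. The portable integer type for holding an address is uintptr_t from <stdint.h>.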

So, Method 1 is printing each element of the array along with that element's address. That's fine and cool.

Method2:

*(u8 *)(BufferAddress+i), BufferAddress+i

BufferAddress is an unsigned integer, and you're adding values to it to get the other addresses. That's a basic point of arrays - the memory will be contiguous, and you access the next element by advancing the number of bytes of each element. Since it's a u8 array, you just advance by 1. Here's a catch, though - if it was an array of ints, you'd want BufferAddress+(i*sizeof(int)), not BufferAddress+i, because each int is typically 4 bytes. Incidentally, this is how pointer arithmetic works in C. If you did ((u32 *)BufferAddress) + 1 you'd get 0x1000004 instead of 0x1000001, because you cast BufferAddress to a pointer to a 4-byte type, and the compiler knows that the next element has to be 4 bytes away.

So (BufferAddress+i) is the address of the ith element of the u8 array. The (u8 *) casts BufferAddress from a boring integer to a pointer to memory locations of type u8, so that when you do *(u8 *) , the compiler knows to treat it as a u8. You could do (u64 *) , and the compiler would say "Oh! This area of memory is 64 bit", and attempt to interpret the values in that way.

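Which might make it clear what's happening in Method 3 now. You're getting the appropriate addresses of each array element, but you're telling the compiler "treat this area of memory as 32 bit data". So each time you use *(u32 *), you're reading 4 bytes of the array and treating them as one unsigned int. Incidentally, once i >= 3, the 4-byte read runs past the 6 bytes you initialized. The malloc'd block is 256 bytes, so the read stays inside the allocation, but bytes 6 onward hold indeterminate values. Dereferencing a u32 pointer at an address that is not suitably aligned is also undefined behavior in C, even though many CPUs tolerate it.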

Let me try to give a visualization of what your memory in that area looks like:

0x1000000 = 1
0x1000001 = 0
0x1000002 = 0
0x1000003 = 4
0x1000004 = 0
0x1000005 = 0

For Method 2, when i = 2, you're looking at BufferAddress (=0x1000000) + 2, i.e. 0x1000002, which has the number 0 in it. The compiler knows it's only one byte, so goes with that.

But for Method 3, when i = 2, you're telling the compiler to treat it as 32 bits. So it doesn't see just '0', it sees the four bytes 0, 4, 0, 0, and combines them into a single integer value, which will definitely not be 0. With the bytes shown above, a little-endian machine would produce 0x00000400, i.e. 1024.
