
How can I understand the output of this program?

My book is attempting to familiarize me with concepts such as pointer dereferencing concerning structures and some weird ways of accessing structures. I am a newbie, and find the following confusing about the code below.

#include <stdio.h>
#include <time.h>
void dump_time_struct_bytes(struct tm *time_ptr, int size) {
    int i;
    unsigned char *raw_ptr;

    printf("bytes of struct located at %p\n", (void *)time_ptr); // %p is the portable way to print a pointer
    raw_ptr = (unsigned char *)time_ptr;
    for (i = 0; i < size; i++)
    {
        printf("%02x ", raw_ptr[i]);
        if (i % 16 == 15) // Print a newline every 16 bytes.
            printf("\n");
    }
    printf("\n");
}
int main() {

    long int seconds_since_epoch;
    struct tm current_time, *time_ptr;
    int hour, minute, second, i, *int_ptr;

    seconds_since_epoch = time(0); // Pass time a null pointer as argument.
    printf("time() - seconds since epoch: %ld\n", seconds_since_epoch);
    time_ptr = &current_time; // Set time_ptr to the address of
                              // the current_time struct.
    localtime_r(&seconds_since_epoch, time_ptr);

    // Three different ways to access struct elements:
    hour = current_time.tm_hour; // Direct access
    minute = time_ptr->tm_min; // Access via pointer
    second = *((int *)time_ptr); // Hacky pointer access
    printf("Current time is: %02d:%02d:%02d\n", hour, minute, second);
    dump_time_struct_bytes(time_ptr, sizeof(struct tm));

    minute = hour = 0; // Clear out minute and hour.

    int_ptr = (int *)time_ptr;
    for (i = 0; i < 3; i++) {
        printf("int_ptr @ %p : %d\n", (void *)int_ptr, *int_ptr);
        int_ptr++; // Adding 1 to int_ptr adds 4 to the address,
    } // since an int is 4 bytes in size.
}

Output:

time() - seconds since epoch: 1189311744
Current time is: 04:22:24
bytes of struct located at 0xbffff7f0
18 00 00 00 16 00 00 00 04 00 00 00 09 00 00 00
08 00 00 00 6b 00 00 00 00 00 00 00 fb 00 00 00
00 00 00 00 00 00 00 00 28 a0 04 08
int_ptr @ 0xbffff7f0 : 24
int_ptr @ 0xbffff7f4 : 22
int_ptr @ 0xbffff7f8 : 4
  1. i. I understand that the author has redeclared *time_ptr as a pointer to unsigned char, but how did it manage to become an array (character array, I think)? I think that this might be to do with the fact that arrays are interpreted as pointers which point to their 0th elements, but I am not sure.

    ii. Secondly, what is the output from the dump_time_struct_bytes function (the dumped bytes)? I understand that that's the bytes from the structure, but I don't know how they are supposed to make up the 4 hours, 22 minutes and 24 seconds stored in it (if this is the case at all). Also, what does the address of *time_ptr correspond to? Is it the start of the structure? If the latter is true, do the corresponding dumped bytes in the output belong only to its first element (tm_sec) or to the whole structure?

  2. The explanation for the "hacky pointer" was a bit weird- why does dereferencing a converted integer pointer solely reveal the contents of the first element in the structure- tm_sec?

Thank you in advance.

"I understand that the author has redeclared *time_ptr as a pointer to unsigned char, but how did it manage to become an array (character array, I think)?"

Pointers point to memory, and memory is an array of bytes. How many bytes a pointer refers to depends on its type, that is, on how the thing pointed at is interpreted. Beyond that, the compiler does no bounds checking in C/C++. So in essence every pointer can be treated as a pointer to the first element of an array whose elements have the pointed-to type. A pointer to unsigned char is a pointer to an array of single bytes. A pointer to a structure is a pointer to an array of elements that are each as long as one structure's size.

So a pointer to a single structure IS a pointer to an array of size 1. Nothing in the language prevents the code from being bad and trying to access an element at the next location.

This is both the power and curse of pointers. And the source of many bugs and security problems in C/C++. It's also why you can do a lot of cool things in the language efficiently.

"With great power comes great responsibility."

So this code interprets the struct pointer first as an array of bytes and prints the hex dump, then as an array of integers. When the pointer is treated as an int*, a single increment moves it by 4 bytes.

Hence the first element is 0x00000018 (little endian for the 4 bytes: 18 00 00 00). 0x18 hex is 24.

The second integer is 0x00000016 (little endian for 16 00 00 00) = 22.

Etc.

Note that the int* moves by 4 bytes because with your particular compiler sizeof(int) == 4. The size of int is implementation-defined and can change with the compiler and target. With a different compiler (say, for an embedded microcontroller), sizeof(int) might be 2, and the integers would print as 24, 0, 22 (assuming the exact same memory block).

See also: "Is the size of C "int" 2 bytes or 4 bytes?"

=== in response to comment ===

"(Accidentally commented somewhere else) Thank you for your answer. However, there is one thing which seems a bit unclear. Let's say I have a pointer to a char 'c'. Is the pointer now a pointer to a char array of size 1?"

YES. A byte array of one byte.

"Also, just to verify, you've mentioned that a pointer to a single structure is a pointer to an array of size one."

YES, but in this case the size of a single element in the array is sizeof(mystruct), which is likely more than a single byte.

"Typecasting that pointer to a pointer to char will therefore result in the array size now being larger than 1 and being an array of bytes, responsible for the hex dump."

YES.

"Hence should any pointer when typecasted in such a manner result in this nice byte breakup?"

YES. This is how byte/memory dumps work.

One more thing about the sizeof operator. sizeof(type) reports the size (in bytes) of an instance of type. sizeof(variable) is equivalent to sizeof(type-of-variable). This has a subtle behavior when variable is a pointer or array. For example:

char c = '0';  // in memory this is the single byte 0x30
char str[] = { 0x31, 0x32, 0x00 }; // an array of bytes 0x31, 0x32, 0x00
char* p = &c;  // note that assigning a pointer to the address of a variable requires the address-of operator (&)

sizeof(char) == 1 // always, by definition
sizeof(c) == 1
sizeof(str) == 3 // compiler knows the array was initialized to 3 bytes
sizeof(p) == 4 // assuming your compiler is using 32-bit pointers.  On a 64-bit machine this would be 8.

sizeof(*p) == 1 // this is the size of the thing pointed to.

p = str; // note that assigning an ARRAY variable name to a pointer does not require address-of, because in most expressions the name of an array is converted ("decays") to a pointer to its first element. An array is still not a pointer: sizeof(str) gives the size of the whole array, and &str has type pointer-to-array.

sizeof (*p) == 1; // even though p was assigned to str - an array - sizeof still returns the answer based on the type of the thing p is pointing to - in this case a single char.  This is subtle but important.  p points to a single character in the array.

// Thus at this point, p points to 0x31.
p++; // p advances in memory by sizeof(*p), now points at 0x32.
p++; // p advances in memory by sizeof(*p), now points at 0x00.
p++; // p advances in memory by sizeof(*p), now points BEYOND THE ARRAY.

IMPORTANT - Because the pointer was advanced past the end of the array, p now points one past the last element. Forming that pointer is allowed, but dereferencing it is undefined behavior: the address may be invalid memory, or it may belong to some other variable. This can result in a crash (in the case of invalid memory), or a bug and memory corruption (and a likely security bug) if it points to "valid" memory that isn't being used as expected. In this specific case, where the variables are assumed to live on the stack, it may point to another variable or perhaps to the return address of the function. Either way, going beyond the array is BAD. VERY VERY BAD. And the COMPILER WON'T STOP YOU!!!

Also, by the way - sizeof is NOT a function. It is an operator that the compiler evaluates at compile time from the type of its operand (C99 variable-length arrays are the one exception). Therefore there is no way to get the size of an array allocated like this:

char* p = malloc(sizeof(char)*100);

The compiler doesn't realize that you're allocating 100 bytes because malloc is a run-time function. (indeed, 100 is usually a variable with a changing value). Therefore sizeof(p) will return the sizeof a pointer (either 4 or 8 as mentioned earlier), and sizeof(*p) will return sizeof(char) , which is 1. In a case like this the code has to remember how much memory was allocated in a separate variable (or in some other way - dynamic allocation is a separate topic altogether).

In other words, sizeof only works for types and for arrays whose declaration (and therefore size) is visible to the compiler, such as these:

char one[] = { 'a' };
char two[] = "b";  // using the string quotes results in a final zero-byte being automatically added.  So this is an array of 2 bytes.
char three[3] = "c"; // the specified size overrides the string size, so this produces an array of 'c', 0, 0 (elements without an initializer are zero-filled)
char bad[1] = "d"; // trying to put 2 bytes in a 1 byte-bag. C accepts this and drops the terminating zero, so bad is not a valid string; C++ rejects it.

unsigned char *raw_ptr;
raw_ptr = (unsigned char *)time_ptr;

This declares a pointer to unsigned char and assigns it the address held in time_ptr (a pointer to struct tm), converted with a cast.

but how did it manage to become an array (character array, I think)

time_ptr itself has not changed. The program is being told to look at the same memory location as time_ptr but to treat it as an array of unsigned char.

I think that this might be to do with the fact that arrays are interpreted as pointers which point to their 0th elements, but I am not sure.

In most expressions an array decays to a pointer to its first element, so yes, array code is usually written in terms of pointers. A pointer does not have to refer to index 0, though; it only starts out there when it is first set from the array.

Secondly, what is the output from the dump_time_struct_bytes function (the dumped bytes)?

They are the raw bytes of the structure. C has no dedicated byte type, so char or unsigned char is commonly used instead.

Also, what does the address of *time_ptr correspond to? Is it the start of the structure?

Yes.

If the latter is true, do the corresponding dumped bytes in the output belong only to its first element (tm_sec) or to the whole structure ?

The whole structure, because the second argument, size, was passed as sizeof(struct tm) (i.e., all the bytes comprising that type).

The explanation for the "hacky pointer" was a bit weird- why does dereferencing a converted integer pointer solely reveal the contents of the first element in the structure- tm_sec?

It seems that the first data member is tm_sec and that it is of type int. Therefore a pointer to struct tm points to the same memory that stores tm_sec. So the pointer is cast to int*, since tm_sec is of type int and we want a pointer to it. Then it is dereferenced to read the value at that address (treated as an int rather than as a struct tm).


Note: Given an arbitrary 4 bytes, what do they mean? If they are viewed as an unsigned 32-bit integer type, a certain value is produced. If they are viewed as a 32-bit floating-point type, a different value may be produced. Casting is a way to force a particular "view" of the bytes regardless of what the bytes really represent.

The pointer struct tm *time_ptr is cast to unsigned char *, which simply means that the memory it points to will now be treated as a sequence of 1-byte values. This is the main concept behind pointer arithmetic: the type of the pointer governs how many bytes the pointer moves when it is incremented. Since this is a char pointer, incrementing it moves it ahead by a single byte, and you can see the memory dump being printed byte by byte.

In the second case, the type of the pointer is int *, pointing to the same memory location, which is now treated as a sequence of sizeof(int)-byte values (the size depends on the platform). In this case it is 4 bytes. Now you can see that the 4-byte group 0x00 00 00 18 is equal to 24 in decimal. Similarly, 0x00 00 00 16 is equal to 22 in decimal and 0x00 00 00 04 is equal to 4 in decimal. (Take the endianness into account here: the dump shows the least significant byte first.)
