How can I understand the output of this program?

My book is attempting to familiarize me with concepts such as pointer dereferencing with structures and some weird ways of accessing structures. I am a newbie and find the following points about the code below confusing.

#include <stdio.h>
#include <time.h>
void dump_time_struct_bytes(struct tm *time_ptr, int size) {
    int i;
    unsigned char *raw_ptr;

    printf("bytes of struct located at %p\n", (void *)time_ptr); // %p is the portable way to print a pointer.
    raw_ptr = (unsigned char *)time_ptr;
    for (i = 0; i < size; i++)
    {
        printf("%02x ", raw_ptr[i]);
        if (i % 16 == 15) // Print a newline every 16 bytes.
            printf("\n");
    }
    printf("\n");
}
int main() {

    time_t seconds_since_epoch; // time() returns time_t, which localtime_r() also expects.
    struct tm current_time, *time_ptr;
    int hour, minute, second, i, *int_ptr;

    seconds_since_epoch = time(0); // Pass time a null pointer as argument.
    printf("time() - seconds since epoch: %ld\n", (long)seconds_since_epoch);
    time_ptr = &current_time; // Set time_ptr to the address of
                              // the current_time struct.
    localtime_r(&seconds_since_epoch, time_ptr);

    // Three different ways to access struct elements:
    hour = current_time.tm_hour; // Direct access
    minute = time_ptr->tm_min; // Access via pointer
    second = *((int *)time_ptr); // Hacky pointer access
    printf("Current time is: %02d:%02d:%02d\n", hour, minute, second);
    dump_time_struct_bytes(time_ptr, sizeof(struct tm));

    minute = hour = 0; // Clear out minute and hour.

    int_ptr = (int *)time_ptr;
    for (i = 0; i < 3; i++) {
        printf("int_ptr @ %p : %d\n", (void *)int_ptr, *int_ptr);
        int_ptr++; // Adding 1 to int_ptr adds 4 to the address,
    } // since an int is 4 bytes in size.
}

Output:

time() - seconds since epoch: 1189311744
Current time is: 04:22:24
bytes of struct located at 0xbffff7f0
18 00 00 00 16 00 00 00 04 00 00 00 09 00 00 00
08 00 00 00 6b 00 00 00 00 00 00 00 fb 00 00 00
00 00 00 00 00 00 00 00 28 a0 04 08
int_ptr @ 0xbffff7f0 : 24
int_ptr @ 0xbffff7f4 : 22
int_ptr @ 0xbffff7f8 : 4

  1. i. I understand that the author has redeclared *time_ptr as a pointer to unsigned char, but how did it manage to become an array (a character array, I think)? I think this might have to do with the fact that arrays are interpreted as pointers to their 0th elements, but I am not sure.

    ii. Secondly, what is the output of the dump_time_struct_bytes function (the dumped bytes)? I understand that those are the bytes of the structure, but I don't know how they are supposed to make up the 4 hours, 22 minutes and 24 seconds stored in it (if that is the case at all). Also, what does the address of *time_ptr correspond to? Is it the start of the structure? If so, do the dumped bytes in the output belong only to its first element (tm_sec) or to the whole structure?

  2. The explanation for the "hacky pointer" was a bit weird: why does dereferencing the converted integer pointer reveal only the contents of the first element of the structure, tm_sec?

Thank you in advance.

"I understand that the author has redeclared *time_ptr as a pointer to unsigned char, but how did it manage to become an array (character array, I think)?"

Pointers point to memory. Memory is an array of bytes. How many bytes a pointer refers to depends on the interpretation (type) of the thing pointed at. Beyond that simple fact, the compiler doesn't do bounds checking in C/C++. So in essence EVERY pointer can be treated as a pointer to an array of elements of the type the pointer points at. A pointer to unsigned char is a pointer to an array of single-byte chars. A pointer to a structure is a pointer to an array of elements that are each as long as one structure.

So a pointer to a single structure IS a pointer to an array of size 1. Nothing in the language prevents bad code from trying to access an element at the next location.

This is both the power and the curse of pointers, and the source of many bugs and security problems in C/C++. It's also why you can do a lot of cool things in the language efficiently.

"With great power comes great responsibility."

So this code interprets the struct pointer first as an array of bytes and prints the hex dump, then as an array of integers. When the pointer is processed as an int*, a single increment moves it by 4 bytes.

Hence the first element is 0x00000018 (stored little endian as the 4 bytes 18 00 00 00). 0x18 hex is 24.

The second integer is 0x00000016 (little endian for 16 00 00 00) = 22.

Etc.

Note that the int* moves by 4 bytes because with your particular compiler sizeof(int) == 4. "int" is a special type whose size can change depending on your compiler. If you had a different compiler (say, for an embedded microcontroller), then sizeof(int) might be 2 and the integers would print out as 24, 0, 22 (assuming the exact same memory block).

See also the related question: Is the size of C "int" 2 bytes or 4 bytes?

=== in response to comment ===

"(Accidentally commented somewhere else) Thank you for your answer. However, there is one thing which seems a bit unclear. Let's say I have a pointer to a char 'c'. Is the pointer now a pointer to a char array of size 1?"

YES. A byte array of one byte.

Also, just to verify, you've mentioned that a pointer to a single structure is a pointer to an array of size one.

YES, but in this case the size of a single element in the array is sizeof(mystruct), which is likely more than a single byte.

Typecasting that pointer to a pointer to char will therefore result in the array size now being larger than 1 and being an array of bytes, responsible for the hex dump.

YES.

Hence should any pointer when typecasted in such a manner result in this nice byte breakup?

YES. This is how byte/memory dumps work.

One more thing about the sizeof operator. sizeof(type) reports the size (in bytes) of an instance of type. sizeof(variable) is equivalent to sizeof(type-of-variable). This has a subtle behavior when the variable is a pointer or an array. For example:

char c = '0';                      // in memory this is the single byte 0x30
char str[] = { 0x31, 0x32, 0x00 }; // an array of the bytes 0x31, 0x32, 0x00

// sizeof(char) == 1 and sizeof(c) == 1
// sizeof(str) == 3: the compiler knows the array was initialized with 3 bytes

char *p = &c; // note that pointing p at a variable requires the address-of operator (&)

// sizeof(p) == 4, assuming your compiler uses 32-bit pointers. With 64-bit pointers this would be 8.
// sizeof(*p) == 1: this is the size of the thing pointed to.

p = str; // note that assigning an ARRAY name to a pointer does not require address-of, because in most expressions the array name is converted to a pointer to its first element. sizeof is one of the exceptions, which is why sizeof(str) still reports the whole array.

// sizeof(*p) == 1: even though p was assigned from str, an array, sizeof still answers based on the type of the thing p points to, in this case a single char. This is subtle but important. p points to a single character in the array.

// Thus at this point, p points to 0x31.
p++; // p advances in memory by sizeof(*p), now points at 0x32.
p++; // p advances in memory by sizeof(*p), now points at 0x00.
p++; // p advances in memory by sizeof(*p), now points BEYOND THE ARRAY and must not be dereferenced.

IMPORTANT - Because the pointer was advanced past the end of the array, at this point p points either to possibly invalid memory or to some other random variable in memory. Using it can result in a crash (in the case of invalid memory), or in a bug and memory corruption (and a likely security bug) if it points to "valid" memory that isn't being used as expected. In this specific case, where the variables are assumed to live on the stack, it may point to another variable or perhaps to the return address of the function. Either way, going beyond the array is BAD. VERY VERY BAD. And the COMPILER WON'T STOP YOU!!!

Also, by the way, sizeof is NOT a function. It is an operator that the compiler evaluates at compile time from the types in its symbol table (C99 variable-length arrays are the one exception). Therefore there is no way to get the size of an array allocated like this:

char* p = malloc(sizeof(char)*100);

The compiler doesn't realize that you're allocating 100 bytes because malloc is a run-time function (indeed, in real code the 100 is usually a variable whose value changes). Therefore sizeof(p) will return the size of a pointer (either 4 or 8, as mentioned earlier), and sizeof(*p) will return sizeof(char), which is 1. In a case like this the code has to remember how much memory was allocated in a separate variable (or in some other way; dynamic allocation is a separate topic altogether).

In other words, sizeof only gives the full size for types and for arrays whose size is fixed in the declaration (for instance by an initializer in the code), such as these:

char one[] = { 'a' };
char two[] = "b";  // using the string quotes results in a final zero byte being added automatically. So this is an array of 2 bytes.
char three[3] = "c"; // the specified size overrides the string size, so this produces an array of 'c', 0, 0 (the remaining bytes are zero-filled)
char bad[1] = "d"; // trying to put 2 bytes in a 1-byte bag. C accepts this and silently drops the final zero byte, so bad is not a valid string; C++ rejects it as an error.

unsigned char *raw_ptr;
raw_ptr = (unsigned char *)time_ptr;

This declares a pointer to unsigned char and assigns it the address held in time_ptr, a pointer to struct tm, converted with a cast.

but how did it manage to become an array (character array, I think)

The time_ptr has not changed. The program is being told to look at the same memory location as time_ptr but to treat it as an array of unsigned char.

I think that this might be to do with the fact that arrays are interpreted as pointers which point to their 0th elements, but I am not sure.

Array types decay to pointers: in most expressions an array name is converted to a pointer to its 0th element. So yes, arrays are usually handled through pointers. A pointer doesn't have to point at the 0th element, though; it just starts out that way when it is first taken from the array.

Secondly, what is the output from the dump_time_struct_bytes function (the dumped bytes)?

Yes, those are the bytes of the structure. There is no byte type in C, so char or unsigned char is often used.

Also, what does the address of *time_ptr correspond to? Is it the start of the structure?

Yes.

If the latter is true, do the corresponding dumped bytes in the output belong only to its first element (tm_sec) or to the whole structure?

The whole structure, because the second argument, size, was computed with sizeof(struct tm) (i.e., all the bytes that make up that type).

The explanation for the "hacky pointer" was a bit weird: why does dereferencing a converted integer pointer solely reveal the contents of the first element in the structure, tm_sec?

It seems that the first data member is tm_sec and that it is of type int. Therefore a pointer to struct tm points to the same memory that stores tm_sec. So the pointer is cast to int *, since tm_sec is of type int and we're dealing with a pointer to it. Then it's dereferenced to read the value at that address (treating it as an int as opposed to a struct tm).


Note: Take an arbitrary 4 bytes. What do they mean? If they are viewed as an unsigned 32-bit integer, a certain value is produced. If they are viewed as a 32-bit floating-point number, a different value may be produced. Casting is a way to force a particular "view" of the bytes regardless of what the bytes really represent.

The pointer struct tm *time_ptr is cast to unsigned char *, which simply means that the memory it points to will now be treated as a sequence of 1-byte values. This is the main idea behind pointer arithmetic: the type of the pointer governs how many bytes the pointer moves when it is incremented. Since this is a char pointer, incrementing it moves it ahead by just a single byte, and you can see the memory dump being printed byte by byte.

In the second case the pointer has type int * and points to the same memory location, which is now treated as a sequence of sizeof(int)-byte values (the size depends on the platform). In this case it is 4 bytes. Now you can see that the 4-byte group 0x00000018 is equal to 24 in decimal. Similarly, 0x00000016 is equal to 22 in decimal and 0x00000004 is equal to 4 in decimal. (Take the endianness into account here: the dump shows the lowest byte first.)
