
Memory allocation and unused byte in basic C program

I have a question regarding memory allocation in a basic C program.

#include <stdio.h>

int main(void)
{
    int  iarray[3];
    char carray[3];

    /* %p expects a void *, so each pointer is cast before printing */
    printf("%p\n", (void *)&iarray);       // b8
    printf("%p\n", (void *)(&iarray + 1)); // c4
    printf("%p\n", (void *)&carray);       // c5

    return 0;
}

Given the code above, you can see that &iarray+1 and &carray differ by one byte. I'm not sure what that byte is for or what is in it. Why does the compiler leave an unused byte between the two arrays?

I thought it might be used to record the array size, but as I understand it, sizeof is a compile-time operator that knows the size without any real memory being allocated, so there is no need to store the array size.

Note: the output is shown in the comment on each printf.

b8
c4
c5

Playground: https://onlinegdb.com/cTdzccpDvI

Thanks.

Compilers are free to arrange variables in memory any way they see fit. Typically, each variable is placed at an address that is a multiple of its type's alignment requirement; for example, a 4-byte int or an array of such ints will start at an address that is a multiple of 4.
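The alignment rule can be checked directly. The following is a minimal sketch, assuming a C11 compiler (for _Alignof) and a platform where converting a pointer to uintptr_t yields its numeric address; the helper names is_aligned and int_array_is_aligned are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the address p is a multiple of alignment, 0 otherwise.
 * Assumption: the uintptr_t conversion gives the plain numeric address. */
static int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

/* Checks the rule on a local int array like the one in the question. */
static int int_array_is_aligned(void)
{
    int iarray[3];
    return is_aligned(iarray, _Alignof(int));
}
```

Calling int_array_is_aligned() from main should report that the array starts on a multiple of _Alignof(int). A char array has an alignment of 1, so the same check on it holds for any address.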

In this case, you have an int array starting at an address that is a multiple of 4, followed by an unused byte, followed by a char array of size 3. In theory, an int or long could immediately follow the char array in memory if one were defined, since the next available address (c8) is a multiple of 8.

From your output, it looks like the stack is arranged like this for these local variables:

b8-bb: 1st integer of iarray
bc-bf: 2nd integer of iarray
c0-c3: 3rd integer of iarray
c4:    padding probably, only compiler knows
c5-c7: carray
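The one-byte gap in this layout is plain arithmetic on the printed addresses. A small sketch, where gap_bytes is a hypothetical helper and the 12-byte size of iarray assumes a 4-byte int:

```c
#include <stddef.h>

/* Number of unused bytes between the end of a first object and the
 * start of a second one, given their start addresses (or offsets). */
static size_t gap_bytes(size_t first_start, size_t first_size,
                        size_t second_start)
{
    return second_start - (first_start + first_size);
}
```

With the low address bytes from the question, gap_bytes(0xb8, 12, 0xc5) evaluates to 1: iarray ends just before c4, and carray starts at c5.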

Now, when you do &iarray+1, you are taking the address of an array of type int[3] and adding 1 to it in units of that array type. In other words, you get the address where the next int[3] array would start, which is indeed c4 (though no such array exists, because there is only one int[3]).

This code is actually valid. You must not dereference this pointer, but because it points exactly one past the end of iarray, forming the pointer and printing its value is legal (in other words, it is not undefined behavior, as &iarray+2 would be).
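A one-past-the-end pointer is commonly used as a loop bound, which is the reason the language allows it to be formed and compared. A minimal sketch with made-up helper names (sum_ints, sum_iarray); the end pointer is compared against but never dereferenced:

```c
/* Sums the ints in the half-open range [first, one_past_last). */
static int sum_ints(const int *first, const int *one_past_last)
{
    int total = 0;
    for (const int *p = first; p != one_past_last; ++p)
        total += *p;   /* p is always strictly before one_past_last here */
    return total;
}

static int sum_iarray(void)
{
    int iarray[3] = {1, 2, 3};
    /* iarray + 3 is the same address that &iarray + 1 points to. */
    return sum_ints(iarray, iarray + 3);
}
```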

If you also print this:

printf("%p\n", (void *)(iarray+1));

You should get bc, because now you are taking a pointer to int (iarray is treated as a pointer to its first int), and adding 1 to it yields the address of the next int.
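The difference between the two expressions is only the pointer type, which sets the step size. A short sketch that measures both steps in bytes (the function names are made up for illustration):

```c
#include <stddef.h>

/* Bytes covered by &iarray + 1: the pointer type is int (*)[3]. */
static size_t whole_array_step(void)
{
    int iarray[3];
    return (size_t)((char *)(&iarray + 1) - (char *)&iarray);
}

/* Bytes covered by iarray + 1: iarray decays to int *. */
static size_t element_step(void)
{
    int iarray[3];
    return (size_t)((char *)(iarray + 1) - (char *)iarray);
}
```

On a platform with a 4-byte int, the first step is 12 bytes (b8 to c4) and the second is 4 bytes (b8 to bc).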

This behavior depends purely on the (compiler) implementation; the C standard does not dictate how local variables are laid out. What probably happens is this:

When a function that has local variables (main() in this case) is invoked, memory for those variables is allocated on the stack. In this case, 15 bytes are needed, but the stack allocation likely requires 4-byte alignment, so 16 bytes are allocated.

It is also likely that the int array must be 4-byte aligned, hence its address is a multiple of 4. The char array has no alignment requirement, so it can be placed anywhere in the 4 remaining bytes.

So, in short, the additional byte is unused but allocated because of alignment.
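The step from 15 bytes to 16 is an ordinary round-up to the next multiple of the alignment. A minimal sketch; round_up is a hypothetical helper, and the 4-byte stack alignment is the guess made above, not something the compiler guarantees:

```c
#include <stddef.h>

/* Rounds n up to the next multiple of align (align must be nonzero). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}
```

For example, round_up(15, 4) gives 16, and a size that is already a multiple of 4 is left unchanged.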

The reason this happens is that the particular compiler you are using allocates memory from high addresses to low in this particular situation.

The compiler analyzes the main routine and sees that it needs 3 int for iarray and 3 char for carray. For whatever reason, it decides to work on carray first.

The compiler starts with a planned stack frame that is required to have 16-byte alignment at certain points. Additional data is needed on the stack, with the result that the point where the compiler starts putting local variables is at an address that is 8 modulo 16 (so its hexadecimal representation ends in 8). That is, from some address like 7FFC565E90A8 and up, memory is used for managing the stack frame. The first bytes for local objects will be at 7FFC565E90A7, 7FFC565E90A6, 7FFC565E90A5, 7FFC565E90A4, and so on.

The compiler takes the first three bytes of that space for carray. Recall that we are working from high addresses to low addresses. (For historical reasons, that is the direction in which stacks grow; some high address is assigned as the starting point, and new data is put at lower addresses.) So carray is put at address 7FFC565E90A5. It fills the bytes at 7FFC565E90A5, 7FFC565E90A6, and 7FFC565E90A7.

Then the compiler needs to assign 12 bytes for iarray. The next available 12 bytes are from 7FFC565E9099 to 7FFC565E90A4. However, the int elements in iarray require 4-byte alignment, so they cannot start at 7FFC565E9099. Therefore, the compiler adjusts to have them start at 7FFC565E9098. Then iarray fills the bytes from 7FFC565E9098 to 7FFC565E90A3, and 7FFC565E90A4 is unused.
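The walk-through above can be modeled in a few lines. This is only a sketch of the described behavior, not what any particular compiler actually runs; alloc_down is a hypothetical helper, and the alignment is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of downward allocation: moves *top down by size,
 * then rounds the result down to a multiple of align (a power of two).
 * Returns the start address of the new object and updates *top.
 */
static uint64_t alloc_down(uint64_t *top, size_t size, size_t align)
{
    uint64_t addr = *top - size;
    addr &= ~((uint64_t)align - 1);   /* round down to the alignment */
    *top = addr;
    return addr;
}
```

Starting with top at 7FFC565E90A8, a 3-byte request with alignment 1 lands at 7FFC565E90A5, and a following 12-byte request with alignment 4 lands at 7FFC565E9098 rather than 7FFC565E9099, which leaves 7FFC565E90A4 unused.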

Note that in other situations, the compiler may arrange local objects in different ways. When you have multiple objects with different alignments, the compiler may choose to cluster all objects with the same alignment to reduce the number of places where it needs to insert padding. A compiler could also choose to allocate memory for objects in alphabetical order by their names. Or it could do so in the order in which it happens to store them in its hash table. Or it could use some combination of these, such as clustering all objects by alignment requirement and then sorting by name within each cluster.
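To see why ordering matters for padding, here is a rough model that totals the bytes consumed when objects are placed downward in a given order. The struct local type and frame_bytes are made up for illustration, and the model assumes the starting address is already aligned for every object.

```c
#include <stddef.h>

/* Size and alignment requirement of one local object. */
struct local {
    size_t size;
    size_t align;
};

/* Total bytes consumed below an aligned starting point when the objects
 * are placed downward in array order, including any padding. */
static size_t frame_bytes(const struct local *objs, size_t n)
{
    size_t used = 0;
    for (size_t k = 0; k < n; ++k) {
        used += objs[k].size;
        /* round up so that (start - used) is a multiple of the alignment */
        used = (used + objs[k].align - 1) / objs[k].align * objs[k].align;
    }
    return used;
}
```

In this model, placing the 3-byte char array first and the 12-byte int array second consumes 16 bytes, while the opposite order consumes 15, with no padding byte.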
