
Memory allocation and unused byte in basic C program

I have a question about memory allocation in a basic C program.

#include <stdio.h>

int main(void)
{
    int  iarray[3];
    char carray[3];

    /* %p expects a void *, so each pointer is cast explicitly */
    printf("%p\n", (void *)&iarray);       // b8
    printf("%p\n", (void *)(&iarray + 1)); // c4
    printf("%p\n", (void *)&carray);       // c5

    return 0;
}

Given the code above, you can see that &iarray+1 and &carray differ by one byte. I'm not sure what that byte is for or what is in it. Why does the compiler leave an unused byte between the two arrays?

I thought it might be used to store the array size, but as I understand it, sizeof is evaluated at compile time and knows the size without any memory being allocated, so there is no need to store the array size.

Note: The output is shown in the comment on each printf.

b8
c4
c5

Playground: https://onlinegdb.com/cTdzccpDvI

Thanks.

Compilers are free to arrange variables in memory any way they see fit. Typically, each variable is placed at an address that is a multiple of its alignment requirement, which for basic types usually equals the type's size. For example, a 4-byte int, or an array of such ints, will start at an address that is a multiple of 4.

In this case, the int array starts at an address that is a multiple of 4 and is followed by one unused byte and then the 3-byte char array. In theory, an int or long, if one were defined, could immediately follow the char array in memory, since the next available address (c8) is a multiple of 8.

From your output it looks like the stack is arranged like this for these local variables:

b8-bb: 1st integer of iarray
bc-bf: 2nd integer of iarray
c0-c3: 3rd integer of iarray
c4:    padding probably, only compiler knows
c5-c7: carray

Now, when you write &iarray+1, you are taking the address of an array of type int[3] and adding 1 to a pointer of that array type. In other words, you get the address where the next int[3] array would begin, which is indeed c4 (although no array exists there, because there is only one int[3]).

This code is actually valid. You must not dereference this pointer, but because it points exactly one past the end of iarray, forming the pointer and printing its value is legal (in other words, it is not Undefined Behavior, as &iarray+2 would be).

If you also print this:

printf("%p\n", iarray+1);

You should get bc, because now iarray decays to a pointer to int, and adding 1 to that gives the address of the next int.

This layout is entirely up to the compiler; the C standard does not specify it. What probably happens is this:

When a function (main() in this case) that has local variables is invoked, memory for those variables is allocated on the stack. Here 15 bytes are needed, but the stack allocation likely requires 4-byte alignment, so 16 bytes are allocated.

It is also likely that the int array must be 4-byte aligned. Hence the address of the int array is a multiple of 4. The char array only needs 1-byte alignment, so it can be placed anywhere in the 4 remaining bytes.

So in short, the additional byte is unused, but allocated due to alignment.

The reason this happens is the particular compiler you are using allocates memory from high addresses to low in this particular situation.

The compiler analyzes the main routine and sees it needs 3 int for iarray and 3 char for carray. For whatever reason, it decides to work on carray first.

The compiler starts with a planned stack frame that is required to have 16-byte alignment at certain points. Additional data is needed on the stack, and as a result the point where the compiler starts putting local variables is at an address that is 8 modulo 16 (so its hexadecimal representation ends in 8). That is, from some address like 7FFC565E90A8 and up, memory is used for managing the stack frame. The first bytes for local objects will be at 7FFC565E90A7, 7FFC565E90A6, 7FFC565E90A5, 7FFC565E90A4, and so on.

The compiler takes the first three bytes of that space for carray . Recall we are working from high addresses to low addresses. (For historical reasons, that is the direction that stacks grow; some high address is assigned as the starting point, and new data is put in lower addresses.) So carray is put at address 7FFC565E90A5. It fills the bytes at 7FFC565E90A5, 7FFC565E90A6, and 7FFC565E90A7.

Then the compiler needs to assign 12 bytes for iarray . The next available 12 bytes are from 7FFC565E9099 to 7FFC565E90A4. However, the int elements in iarray require 4-byte alignment, so they cannot start at 7FFC565E9099. Therefore, the compiler adjusts to have them start at 7FFC565E9098. Then iarray fills bytes from 7FFC565E9098 to 7FFC565E90A3, and 7FFC565E90A4 is unused.

Note that in other situations, the compiler may arrange local objects in different ways. When you have multiple objects with different alignments, the compiler may choose to cluster all objects with the same alignment to reduce the number of places it needs to insert padding. A compiler could also choose to allocate memory for objects in alphabetical order by their names. Or it could do it in the order it happens to store them in its hash table. Or some combination of these things, such as clustering all objects by alignment requirement but then sorting by name within each cluster.


 