Why do C and C++ compilers place explicitly initialized and default initialized global variables in different segments?

I was reading this great post about the memory layout of C programs. It says that default-initialized global variables reside in the BSS segment, and that if you explicitly provide a value for a global variable, it will reside in the data segment.

I've tested the following programs in C and C++ to examine this behaviour.

#include <iostream>
// Both i and s have static storage duration
int i;     // no initializer: zero-initialized, expected to be kept in the BSS segment
int s(5);  // explicitly initialized to a non-zero value, expected to be kept in the data segment
int main()
{
    std::cout << &i << ' ' << &s << '\n';
}

Output:

0x488020 0x478004

So, from the output it clearly looks like the variables i and s reside in completely different segments. But if I remove the initializer (the initial value 5 in this program) from the variable s and then run the program, it gives me the output below.

Output:

0x488020 0x488024

So, from the output it clearly looks like both variables i and s now reside in the same segment (in this case BSS).

This behaviour is also the same in C.

#include <stdio.h>
int i;      /* no initializer: zero-initialized, expected to be kept in the BSS segment */
int s=5;    /* explicitly initialized to a non-zero value, expected to be kept in the data segment */
int main(void)
{
    printf("%p %p\n",(void*)&i,(void*)&s);
}

Output:

004053D0 00403004

So, again, judging by the output (that is, by the addresses of the variables), i and s reside in completely different segments. But again, if I remove the initializer (the initial value 5 in this program) from the variable s and then run the program, it gives me the output below.

Output:

004053D0 004053D4

So, from the output it clearly looks like both variables i and s now reside in the same segment (in this case BSS).

Why do C and C++ compilers place explicitly initialized and default-initialized global variables in different segments? Why does it matter for the location of a global variable whether it is default initialized or explicitly initialized? If I am not wrong, the C and C++ standards never talk about the stack, heap, data segment, code segment, BSS segment and so on, all of which are implementation-specific. So, is it possible for a C++ implementation to store explicitly initialized and default-initialized variables in the same segment instead of keeping them in different segments?

Neither C nor C++ has any notion of "segments", and not every OS does either, so the answer to your question inevitably depends on the platform and compiler.

That said, common implementations treat initialized and uninitialized variables differently. The main difference is that uninitialized (or default zero-initialized) data does not actually have to be saved with the compiled module; it only has to be declared, that is, reserved for later use at run time. In practical "segment" terms, initialized data is saved to disk as part of the binary, while uninitialized data is not. Instead, it is allocated at startup to satisfy the declared "reservations".

The really short answer is "because it takes up less space". (As noted by others, the compiler doesn't have to do this!)

In the executable file, the data section contains each piece of data with its value stored at the corresponding position. This means that for every byte of initialized data, the data section contains one byte.

For zero-initialized globals, there is no reason to store a lot of zeros. Instead, the executable just stores the size of the whole set of data as one single size value. So instead of storing 4132 bytes of zeros in the data section, there is just a "BSS is 4132 bytes long" entry, and it is up to the OS or runtime to make sure that memory is zero. In some cases, the compiler's runtime will call memset(BSSStart, 0, BSSSize) or similar. On Linux, for example, all "unused" memory is filled with zeros anyway when the process is created, so setting BSS to zero is just a matter of allocating the memory in the first place.

And of course, shorter executable files have several benefits: less space taken up on disk, faster loading time (with an extra bonus if the OS pre-fills the allocated memory with zeros), and faster build time, as the compiler/linker doesn't have to write the data to disk.

So there is an entirely practical reason for this.

By definition, BSS is not a separate segment; it is part of the data segment.

In C and C++, statically allocated objects without an explicit initializer are initialized to zero. An implementation may also assign statically allocated variables and constants whose initializer consists solely of zero-valued bits to the BSS section.

The reason to store them in BSS is that variables with no initializer, or with an all-zero value, can be set up at run time without taking up space in the binary file, unlike variables placed in the initialized part of the data segment.
