Why does GCC not assign the static variable when it is initialized to 0

I initialize a static variable to 0, but when I look at the assembly code, I find that memory is only allocated for the variable; no value is assigned.
When I initialize the static variable to some other number, I can see that the memory is assigned a value.
My guess is that GCC assumes the OS will initialize the memory to 0 before the program uses it.

The GCC options I use are "gcc -m32 -fno-stack-protector -c -o".

When I initialize the static variable to 0, the C code and the assembly code are:

static int temp_front=0;
.local  temp_front.1909
.comm   temp_front.1909,4,4

When I initialize it to another number, the code is:

static int temp_front=1;
    .align 4
    .type   temp_front.1909, @object
    .size   temp_front.1909, 4
temp_front.1909:
    .long   1

TL;DR: GCC knows the BSS is guaranteed to be zero-initialized on the platform it's targeting, so it puts zero-initialized static data there.

Big picture

The program loader of most modern operating systems gets two different sizes for each part of the program, such as the data part. The first is the size of the data stored in the executable file (like a PE/COFF .EXE file on Windows or an ELF executable on Linux); the second is the size of the data part in memory while the program is running.

If the data size for the running program is bigger than the amount of data stored in the executable file, the remaining part of the data section is filled with zero bytes. In your program, the .comm line tells the linker to reserve 4 bytes without storing an initial value for them, so the OS zero-initializes them on start.

What does gcc do?

gcc (like other C compilers on such platforms) allocates zero-initialized variables with static storage duration in the .bss section. Everything allocated in that section is zero-initialized on program startup. For the allocation it uses the .comm directive (made file-local here by the preceding .local), which specifies only the size (4 bytes) and the alignment (4).

You can see the sizes of the main section types (code, data, bss) using the size command. If you initialize the variable with one, it is placed in a data section and occupies 4 bytes there. If you initialize it with zero (or not at all), it is instead allocated in the .bss section.

What does ld do?

ld merges all data-type sections of all object files (even those from static libraries) into one data section, followed by all .bss-type sections. The executable output contains a simplified view for the operating system's program loader. For ELF files, this is the "program header". You can take a look at it using objdump -p for any format, or readelf -l for ELF files.

The program headers consist of entries of different types. Among them are a couple of entries of type PT_LOAD describing the "segments" to be loaded by the operating system. One of these PT_LOAD entries is for the data area (where the .data section is linked). It contains a field called p_filesz that specifies how many bytes of initialized data are provided in the ELF file, and a field called p_memsz that tells the loader how much address space should be reserved. The details of which sections get merged into which PT_LOAD entries differ between linkers and depend on command-line options, but generally you will find a PT_LOAD entry that describes a region that is readable and writable but not executable, and whose p_filesz value is smaller than its p_memsz value (potentially zero if there is only a .bss and no .data section). p_filesz is the size of all read+write data sections, whereas p_memsz is bigger so that it also provides space for zero-initialized variables.

The amount by which p_memsz exceeds p_filesz is the sum of all .bss sections linked into the executable. (The values might be off a bit due to alignment to pages or disk blocks.)

See chapter 5 of the System V ABI specification, especially pages 5-2 and 5-3, for a description of the program header entries.

What does the operating system do?

The Linux kernel (or another ELF-compliant kernel) iterates over all entries in the program header. For each entry of type PT_LOAD it allocates virtual address space. It associates the beginning of that address space with the corresponding region in the executable file, and if the space is writable, it enables copy-on-write.

If p_memsz exceeds p_filesz, the kernel arranges for the remaining address space to be completely zeroed out. So the variable that gcc allocated in the .bss section ends up in the "tail" of the read-write PT_LOAD entry in the ELF file, and the kernel provides the zero.

Any whole pages that have no backing data can start out copy-on-write mapped to a shared physical page of zeros.

Why does GCC not assign ...

Most modern OSs automatically zero-initialize the BSS section.

On such an OS, an "uninitialized" variable is identical to a variable that is initialized to zero.

However, there is one difference: the data of uninitialized variables is not stored in the resulting object and executable files; the data of initialized variables is.

This means that "really" zero-initialized variables may lead to a larger file size than uninitialized variables.

For this reason the compiler prefers to use "uninitialized" variables when the variables are zero-initialized.

The GCC option I use is ...

Of course, there are also operating systems that do not automatically initialize "uninitialized" memory to zero.

As far as I remember, Windows 95 is an example of this.

If you want to compile for such an operating system, you may use the GCC command-line option -fno-zero-initialized-in-bss. This option forces GCC to "really" zero-initialize variables that are zero-initialized, by emitting the zeros into the data section.

I just compiled your code with that command-line option; the output looks like this:

    .data
    .align 4
    .type     temp_front, @object
    .size     temp_front, 4
 temp_front:
    .zero  4

Even on Windows 95 there is no point in zero-initializing in the code of every compiled module. Maybe the Win95 program loader (or even MS-DOS) does not initialize the bss section, but the "crt0" init module (linked into every compiled C/C++ program, and which finally calls main() or the DLL entry point) can do that directly, in one fast operation for the whole BSS section. The size of that section is already in the program header, and it can also be made available in a static preinitialized variable whose value is computed by the linker, so there is no need to change the way each module is compiled with gcc.

However, automatic variables (local variables allocated on the stack) are more difficult: the compiler does not know whether a variable will be initialized if its first use is by reference in a call parameter, passed to a non-inlined function that is supposed to fill it (and that may be in another module compiled separately, or linked from an external library or DLL).

GCC only knows when the variable is explicitly assigned in the function itself. If the variable is only used by reference, GCC can now forcibly preinitialize it to zero so that it does not keep a sensitive value left on the stack. This adds some zero-fill code to the compiled function's prologue for these local variables, and it helps prevent some data leaks. (Such a leak is generally unlikely when the variable has a simple type, but when it is a whole structure, many fields may be left in a random state by the subcall.)

C11 indicates that code which assumes auto variables are initialized has "undefined" behavior. But GCC helps close the security risk. C11 allows this: forced zeroing is better than leaving a random value, and both outcomes are consistent with "undefined" behavior, since zero is just as acceptable as a randomly leaked value.

Some secure functions also avoid leaving sensitive data behind when returning: they explicitly clear the variables they no longer need, to avoid exposing them after the function returns (notably when returning from privileged code to unprivileged code). This is good practice, but it is independent of the forced initialization of auto variables that are used by reference in subcalls before they are initialized. GCC is also smart enough not to forcibly initialize these auto variables when there is explicit code that assigns them a value, so the impact is minimal. This feature may be disabled in GCC for applications that want micro-optimizations for performance. Either way it does not add to the BSS size, and the image size of the Linux kernel grows by less than 0.1%, only because of the few bytes of code compiled into the few functions that benefit from this security fix.

And this has no effect on "uninitialized" static variables, which GCC puts in the BSS section, cleared by the OS's program loader or by the program's small crt0 init module.
