
What happens to a declared, uninitialized variable in C? Does it have a value?

If in C I write:

int num;

Before I assign anything to num, is the value of num indeterminate?

Static variables (file scope and function static) are initialized to zero:

int x; // zero
int y = 0; // also zero

void foo() {
    static int x; // also zero
}

Non-static variables (local variables) are indeterminate. Reading them prior to assigning a value results in undefined behavior.

void foo() {
    int x;
    printf("%d", x); // the compiler is free to crash here
}

In practice, they tend to just have some nonsensical value in there initially - some compilers may even put in specific, fixed values to make it obvious when looking in a debugger - but strictly speaking, the compiler is free to do anything from crashing to summoning demons through your nasal passages.

As for why it's undefined behavior instead of simply "undefined/arbitrary value", there are a number of CPU architectures that have additional flag bits in their representation for various types. A modern example would be the Itanium, which has a "Not a Thing" bit in its registers; of course, the C standard drafters were considering some older architectures.

Attempting to work with a value with these flag bits set can result in a CPU exception in an operation that really shouldn't fail (e.g., integer addition, or assigning to another variable). And if you go and leave a variable uninitialized, the compiler might pick up some random garbage with these flag bits set - meaning touching that uninitialized variable may be deadly.

0 if static or global, indeterminate if storage class is auto

C has always been very specific about the initial values of objects. If global or static, they will be zeroed. If auto, the value is indeterminate.

This was the case in pre-C89 compilers and was so specified by K&R and in DMR's original C report.

This was the case in C89, see section 6.5.7 Initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.

This was the case in C99, see section 6.7.8 Initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
- if it has pointer type, it is initialized to a null pointer;
- if it has arithmetic type, it is initialized to (positive or unsigned) zero;
- if it is an aggregate, every member is initialized (recursively) according to these rules;
- if it is a union, the first named member is initialized (recursively) according to these rules.
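
The C99 rules quoted above can be checked with a small sketch. The types struct point and union number, and every object and function name below, are invented for this illustration and do not come from the standard. All four objects have static storage duration and no initializer, so the helper is expected to report that each one holds zero or a null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, invented for this illustration only. */
struct point { int x; double y; const char *label; };
union number { int i; float f; };

/* Static storage duration, no explicit initializer. */
static struct point origin;   /* aggregate: members initialized recursively */
static union number num;      /* union: first named member (i) is zeroed */
static int *ptr;              /* pointer: null pointer */
static int table[4];          /* array: every element is zero */

/* Returns 1 if every object above holds its implicit initial value. */
int statics_are_zeroed(void)
{
    size_t k;

    if (origin.x != 0 || origin.y != 0.0 || origin.label != NULL)
        return 0;
    if (num.i != 0)
        return 0;
    if (ptr != NULL)
        return 0;
    for (k = 0; k < sizeof table / sizeof table[0]; k++)
        if (table[k] != 0)
            return 0;
    return 1;
}

/* Aborts via assert if the guarantee did not hold. */
void check_statics(void)
{
    assert(statics_are_zeroed());
}
```

Calling check_statics() from main should return silently. Moving the same declarations inside a function without the static keyword would make them automatic, and the checks would then read indeterminate values.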

As to what exactly indeterminate means, I'm not sure for C89; C99 says:

3.17.2
indeterminate value
either an unspecified value or a trap representation

But regardless of what standards say, in real life, each stack page actually does start off as zero, but when your program looks at any auto storage class values, it sees whatever was left behind by your own program when it last used those stack addresses. If you allocate a lot of auto arrays you will see them eventually start neatly with zeroes.

You might wonder, why is it this way? A different SO answer deals with that question, see: https://stackoverflow.com/a/2091505/140740

It depends on the storage duration of the variable. A variable with static storage duration is always implicitly initialized with zero.

As for automatic (local) variables, an uninitialized variable has indeterminate value. Indeterminate value, among other things, means that whatever "value" you might "see" in that variable is not only unpredictable, it is not even guaranteed to be stable. For example, in practice (i.e. ignoring the UB for a second) this code

int num;
int a = num;
int b = num;

does not guarantee that variables a and b will receive identical values. Interestingly, this is not some pedantic theoretical concept; it readily happens in practice as a consequence of optimization.

So in general, the popular answer that "it is initialized with whatever garbage was in memory" is not even remotely correct. An uninitialized variable's behavior is different from that of a variable initialized with garbage.

Ubuntu 15.10, Kernel 4.2.0, x86-64, GCC 5.2.1 example

Enough standards, let's look at an implementation :-)

Local variable

Standards: undefined behavior.

Implementation: the program allocates stack space, and never moves anything to that address, so whatever was there previously is used.

#include <stdio.h>
int main() {
    int i;
    printf("%d\n", i);
}

compile with:

gcc -O0 -std=c99 a.c

outputs:

0

and disassembles with:

objdump -dr a.out

to:

0000000000400536 <main>:
  400536:       55                      push   %rbp
  400537:       48 89 e5                mov    %rsp,%rbp
  40053a:       48 83 ec 10             sub    $0x10,%rsp
  40053e:       8b 45 fc                mov    -0x4(%rbp),%eax
  400541:       89 c6                   mov    %eax,%esi
  400543:       bf e4 05 40 00          mov    $0x4005e4,%edi
  400548:       b8 00 00 00 00          mov    $0x0,%eax
  40054d:       e8 be fe ff ff          callq  400410 <printf@plt>
  400552:       b8 00 00 00 00          mov    $0x0,%eax
  400557:       c9                      leaveq
  400558:       c3                      retq

From our knowledge of x86-64 calling conventions:

  • %rdi is the first printf argument, thus the string "%d\n" at address 0x4005e4

  • %rsi is the second printf argument, thus i.

    It comes from -0x4(%rbp), which is the first 4-byte local variable.

    At this point, rbp is in the first page of the stack that has been allocated by the kernel, so to understand that value we would have to look into the kernel code and find out what it sets that to.

    TODO does the kernel set that memory to something before reusing it for other processes when a process dies? If not, the new process would be able to read the memory of other finished programs, leaking data. See: Are uninitialized values ever a security risk?

We can then also play with our own stack modifications and write fun things like:

#include <assert.h>

int f() {
    int i = 13;
    return i;
}

int g() {
    int i;
    return i;
}

int main() {
    f();
    assert(g() == 13);
}

Note that GCC 11 seems to produce a different assembly output, and the above code stops "working"; it is undefined behavior after all. See: Why does -O3 in gcc seem to initialize my local variable to 0, while -O0 does not?

Local variable in -O3

Implementation analysis at: What does <value optimized out> mean in gdb?

Global variables

Standards: 0

Implementation: .bss section.

#include <stdio.h>
int i;
int main() {
    printf("%d\n", i);
}

gcc -O0 -std=c99 a.c

compiles to:

0000000000400536 <main>:
  400536:       55                      push   %rbp
  400537:       48 89 e5                mov    %rsp,%rbp
  40053a:       8b 05 04 0b 20 00       mov    0x200b04(%rip),%eax        # 601044 <i>
  400540:       89 c6                   mov    %eax,%esi
  400542:       bf e4 05 40 00          mov    $0x4005e4,%edi
  400547:       b8 00 00 00 00          mov    $0x0,%eax
  40054c:       e8 bf fe ff ff          callq  400410 <printf@plt>
  400551:       b8 00 00 00 00          mov    $0x0,%eax
  400556:       5d                      pop    %rbp
  400557:       c3                      retq
  400558:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  40055f:       00

# 601044 <i> says that i is at address 0x601044 and:

readelf -SW a.out

contains:

[25] .bss              NOBITS          0000000000601040 001040 000008 00  WA  0   0  4

which says 0x601044 is right in the middle of the .bss section, which starts at 0x601040 and is 8 bytes long.

The ELF standard then guarantees that the section named .bss is completely filled with zeros:

.bss This section holds uninitialized data that contribute to the program's memory image. By definition, the system initializes the data with zeros when the program begins to run. The section occupies no file space, as indicated by the section type, SHT_NOBITS.

Furthermore, the type SHT_NOBITS is efficient and occupies no space on the executable file:

sh_size This member gives the section's size in bytes. Unless the section type is SHT_NOBITS, the section occupies sh_size bytes in the file. A section of type SHT_NOBITS may have a non-zero size, but it occupies no space in the file.

Then it is up to the Linux kernel to zero out that memory region when loading the program into memory when it gets started.
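
To see the .bss behaviour from the C side, here is a minimal sketch. The names big_zero, small_data, BIG_COUNT and the helper functions are invented for this example, and the section placement in the comments is what GCC-style toolchains typically do, not something the C standard promises. The standard only guarantees the values: the uninitialized static array reads as all zeros.

```c
#include <assert.h>
#include <stddef.h>

#define BIG_COUNT ((size_t)1 << 20)   /* 1 MiB, an arbitrary size for the demo */

/* No initializer: toolchains typically place this in .bss (assumption). */
static unsigned char big_zero[BIG_COUNT];

/* Nonzero initializer: typically stored in .data inside the file (assumption). */
static unsigned char small_data[4] = {1, 2, 3, 4};

/* Counts the bytes in p[0..n) that are not zero. */
size_t count_nonzero(const unsigned char *p, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (p[i] != 0)
            count++;
    return count;
}

size_t big_zero_nonzero_bytes(void)
{
    return count_nonzero(big_zero, BIG_COUNT);
}

size_t small_data_nonzero_bytes(void)
{
    return count_nonzero(small_data, sizeof small_data);
}

/* Aborts via assert if the zero-fill guarantee did not hold. */
void check_bss(void)
{
    assert(big_zero_nonzero_bytes() == 0);
}
```

When this is linked into a program, running readelf -SW on the result before and after changing BIG_COUNT should show the .bss size following the array while the file size stays roughly constant. That observation depends on the toolchain.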

That depends. If that definition is global (outside any function) then num will be initialized to zero. If it's local (inside a function) then its value is indeterminate. In theory, even attempting to read the value has undefined behavior -- C allows for the possibility of bits that don't contribute to the value, but have to be set in specific ways for you to even get defined results from reading the variable.

The basic answer is, yes it is undefined.

If you are seeing odd behavior because of this, it may depend on where the variable is declared. If it is within a function, on the stack, then the contents will more than likely be different every time the function gets called. If it has static or module scope, it is zero-initialized and keeps whatever value was last stored in it.

Because computers have finite storage capacity, automatic variables will typically be held in storage elements (whether registers or RAM) that have previously been used for some other arbitrary purpose. If such a variable is used before a value has been assigned to it, that storage may hold whatever it held previously, and so the contents of the variable will be unpredictable.

As an additional wrinkle, many compilers may keep variables in registers which are larger than the associated types. Although a compiler would be required to ensure that any value which is written to a variable and read back will be truncated and/or sign-extended to its proper size, many compilers will perform such truncation when variables are written and expect that it will have been performed before the variable is read. On such compilers, something like:

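#include <stdint.h>

uint16_t hey(uint32_t x, uint32_t mode)
{
    uint16_t q;            /* deliberately left uninitialized */
    if (mode == 1) q = 2;
    if (mode == 3) q = 4;
    return q;              /* indeterminate unless mode is 1 or 3 */
}

uint32_t wow(uint32_t mode)
{
    return hey(1234567, mode);
}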

might very well result in wow() storing the values 1234567 and mode into registers 0 and 1, respectively, and calling hey(). Since x isn't needed within hey(), and since functions are supposed to put their return value into register 0, the compiler may allocate register 0 to q. If mode is 1 or 3, register 0 will be loaded with 2 or 4, respectively, but if it is some other value, the function may return whatever was in register 0 (i.e. the value 1234567) even though that value is not within the range of uint16_t.

To avoid requiring compilers to do extra work to ensure that uninitialized variables never seem to hold values outside their domain, and avoid needing to specify indeterminate behaviors in excessive detail, the Standard says that use of uninitialized automatic variables is Undefined Behavior. In some cases, the consequences of this may be even more surprising than a value being outside the range of its type. For example, given:

void moo(int mode)
{
  if (mode < 5)
    launch_nukes();
  hey(0, mode);      
}

a compiler could infer that because invoking moo() with a mode which is greater than 3 will inevitably lead to the program invoking Undefined Behavior, the compiler may omit any code which would only be relevant if mode is 4 or greater, such as the code which would normally prevent the launch of nukes in such cases. Note that neither the Standard, nor modern compiler philosophy, would care about the fact that the return value from "hey" is ignored -- the act of trying to return it gives a compiler unlimited license to generate arbitrary code.
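
One way to take away the license described above is to give q a definite value on every path. The following is only a sketch: the name hey_defined and the choice of 0 as the fallback are assumptions made for this example, not part of the original answer.

```c
#include <assert.h>
#include <stdint.h>

/* Variant of hey() in which q is initialized, so every path returns a
   well-defined value. The fallback 0 is an arbitrary choice for this sketch. */
uint16_t hey_defined(uint32_t x, uint32_t mode)
{
    uint16_t q = 0;        /* assumed default for modes other than 1 and 3 */

    (void)x;               /* x is unused, as in the original */
    if (mode == 1) q = 2;
    if (mode == 3) q = 4;
    return q;
}

/* Aborts via assert if any of the three cases misbehaves. */
void check_hey_defined(void)
{
    assert(hey_defined(1234567, 1) == 2);
    assert(hey_defined(1234567, 3) == 4);
    assert(hey_defined(1234567, 7) == 0);
}
```

With this version a compiler can no longer assume that mode is at most 3 inside moo(), because no value of mode leads to undefined behavior. Compiler diagnostics can also help find the original problem: GCC's -Wall, for instance, may report q as possibly uninitialized when optimization is enabled.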

If the storage class is static or global, then during loading the variable's memory location (ML), which lives in the BSS, is initialised to 0 unless the variable is explicitly given some initial value. In the case of local uninitialized variables, the memory location may hold a trap representation. So if any of your registers containing important info is overwritten by the compiler, the program may crash.

But some compilers may have a mechanism to avoid such a problem.

I was working with the NEC V850 series when I realised there is a trap representation, with bit patterns that represent undefined values for data types except for char. When I read an uninitialized char, I got a default value of zero due to the trap representation. This might be useful for anyone using the NEC V850ES.

If in C I write:

int num;

Before I assign anything to num, is the value of num indeterminate?

As far as I have seen, it mostly depends on the compiler, but in most cases the value is assumed to be 0 by the compiler.
I got a garbage value with VC++, while TC gave the value 0. I printed it like below:

int i;
printf("%d", i);
