Why exactly does this short program produce this output?

I've had a slow weekend, so just for interest I started working through K. N. King's 'C Programming: A Modern Approach' today and began the exercises in the second chapter. One of the exercises is this:

Write a program that declares several int and float variables - without initialising them - and then prints their values.

My little solution to this is below, including the output. It isn't really a problem as such; I'm just quite curious as to why it does what it does, especially since I'm not so well-informed on lower-level languages.

I had a quick look for some other pre-made solutions on GitHub, hoping they'd be commented or something, but it's such a simple problem that there really was nothing. K. N. King's own site suggests that the pattern of the output depends on, quote, "many factors", but doesn't divulge any more. This is reflected in my output being different from King's.

#include <stdio.h>

int main()
{
    int num1, num2, num3;    
    float flo1, flo2, flo3;

    printf("Our integers are %d, %d, %d\n", num1, num2, num3); 
    printf("Our floats are %g, %g, %g\n", flo1, flo2, flo3);

    return 0;
}   

The output is below:

C:\C\Intro\exercises>a
Our integers are 0, 16, 0
Our floats are 2.8026e-045, 0, 1.73639e-038

Again, not so much a problem; I'm just curious what this is doing, probably at the hardware level.

Strictly speaking, your code has undefined behaviour, meaning it could do pretty much whatever it pleases.

In practice, your variables live on the stack but are not initialised. This likely means they pick up whatever values the stack happens to contain at the locations where the compiler places the variables. Those values are most likely left over from routines that were called earlier in your process's lifetime, i.e. during its startup.

First, let's consider how a very simple compiler might handle this code. When it sees int num1, num2, num3; inside a function, it may make space for these on the stack. A stack is how compilers commonly implement objects with automatic storage duration (notably variables defined inside functions that are neither static nor thread-local). Whenever a new function is called, the code the compiler generated makes space on the stack for that function's local variables and other information. Similarly, space is allocated for float flo1, flo2, flo3;

Then, when the compiler sees printf("Our integers are %d, %d, %d\n", num1, num2, num3); it generates code to load the values of num1, num2, and num3 and to pass them to printf. The values are loaded from the memory that was allocated for these objects. What is in that memory? Well, this source code does not assign any values to those objects, so the data in that memory is whatever data was there when the main routine started.

What was in that memory? Commonly, when an operating system provides general memory to a process, it clears the memory (sets all bytes in it to zero) so that it does not reveal any data of whatever program used the memory previously. So why aren't the printf statements printing zeros?

main is not actually the start of your program. Before main can be executed, something has to set up the C environment. Running a C program requires that any data used by library routines you might call (such as printf) be initialized. Also, when the main routine returns, it has to have something to return to, something that will take the return value and pass it to the system as a process exit status. That code is also responsible for closing open files and doing some other clean-up work. Commonly, when you link a C program, an extra “start” routine is linked into your executable file. When the operating system starts your program, it calls this “start” routine first, and the start routine sets up the C environment and then calls main.

So, when you print num1, num2, num3, flo1, flo2, and flo3, the memory allocated for them has already been used by the “start” routine, and it contains whatever data the “start” routine happened to leave lying around.

That is one explanation for why you see various values printed by this source code.

On the other hand, let's consider a more sophisticated compiler. A more sophisticated compiler analyzes the code and can see that the variables are used without being initialized. It will typically warn the user about this, and it also knows that this violates various rules in C. In particular, the C standard does not define what happens when you use an object with automatic storage duration that has been neither initialized nor (for technical/esoteric reasons) had its address taken.

To assist with optimization, sophisticated compilers have special ways of dealing with undefined behavior. For example, if the compiler sees code such as:

if (some test)
    FunctionA();
else
{
    Some undefined behavior here…
    FunctionB();
}

the compiler can optimize this by “choosing” how to define the undefined behavior. It can define the behavior so that the program acts as if it had been written:

if (some test)
    FunctionA();
else
{
    FunctionA();
}

because that is a valid instance of undefined behavior: anything is permitted, including calling FunctionA. Then optimization can proceed to simplify that to:

FunctionA();

Sometimes cases like this arise in code because a programmer was writing for portability to various environments, and it happens that some test indeed cannot be false in a particular compiler, so this optimization produces correct and simple code. Cases like this can also arise where a compiler has been transforming code in other ways, and the code above arises not because it was literally written that way in the source code but because it was generated by the compiler during its internal transformations. For example, a compiler might split a loop into separate code for the first iteration, the general middle iterations, and the last iteration, and some test might be always true in the last iteration, even though it was not always true in the context where the programmer wrote it.

What this means is that, when your code invokes undefined behavior (behavior that is not only undefined according to the C standard but also not defined by the C implementation), it may be transformed in ways you do not expect.

I tested this code with a version of LLVM and Clang, and the compiler optimized it by not allocating any memory for the variables and not loading them from memory to pass to printf. Instead, it just called printf without any preparation for those arguments. On the platform I am using, those arguments are passed in registers. So the result is that printf prints whatever values happen to be in those registers. As with the memory, this will be whatever data happened to be left in those registers by earlier software.
