
Why and how does GCC compile a function with a missing return statement?

#include <stdio.h>

char toUpper(char);

int main(void)
{
    char ch, ch2;
    printf("lowercase input : ");
    ch = getchar();
    ch2 = toUpper(ch);
    printf("%c ==> %c\n", ch, ch2);

    return 0;
}

char toUpper(char c)
{
    if(c>='a'&&c<='z')
        c = c - 32;
}

In the toUpper function, the return type is char, but there is no return statement anywhere in toUpper(). I compiled the source code with gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4) on Fedora 14.

Of course, a warning is issued: "warning: control reaches end of non-void function", but the program works as expected.

What happens to this code when it is compiled with gcc? I would like a solid answer for this case. Thanks :)

What happened in your case is that when the C program was compiled into assembly language, your toUpper function ended up looking something like this:

_toUpper:
LFB4:
        pushq   %rbp
LCFI3:
        movq    %rsp, %rbp
LCFI4:
        movb    %dil, -4(%rbp)
        cmpb    $96, -4(%rbp)
        jle     L8
        cmpb    $122, -4(%rbp)
        jg      L8
        movzbl  -4(%rbp), %eax
        subl    $32, %eax
        movb    %al, -4(%rbp)
L8:
        leave
        ret

The subtraction of 32 was carried out in the %eax register. And in the x86 calling convention, that is the register in which the return value is expected to be! So... you got lucky.

But please pay attention to the warnings. They are there for a reason!
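One way to act on that warning is to have gcc reject the code outright. The sketch below is an assumption-laden illustration, not the asker's code: `to_upper_ascii` and `check_to_upper_ascii` are hypothetical names, and the arithmetic assumes an ASCII character set. Building with `-Werror=return-type` turns the "control reaches end of non-void function" warning into a compile error, so a version without the return statements would no longer build.

```c
/* Suggested build line (gcc): gcc -Wall -Werror=return-type upper.c
 * With -Werror=return-type, a non-void function that can fall off its
 * end is rejected at compile time instead of merely producing a warning. */
#include <assert.h>

/* Hypothetical name; assumes ASCII, where 'a' - 'A' == 32. */
char to_upper_ascii(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c - ('a' - 'A'));  /* lowercase letter: convert */
    return c;                            /* anything else: unchanged */
}

/* Small self-check covering both paths through the function. */
void check_to_upper_ascii(void)
{
    assert(to_upper_ascii('a') == 'A');
    assert(to_upper_ascii('z') == 'Z');
    assert(to_upper_ascii('A') == 'A');
    assert(to_upper_ascii('5') == '5');
}
```

Every path through the function now ends in a return, so the result no longer depends on what happens to be left in a register.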

It depends on the Application Binary Interface and which registers are used for the computation.

For example, on x86 the return value is passed back in EAX, so gcc is most likely using that same register to hold the result of the calculation as well.

Essentially, c is pushed into the spot that should later be filled with the return value; since it's not overwritten by use of return, it ends up as the value returned.

Note that relying on this (in C, or in any other language where, unlike Perl, this isn't an explicit language feature) is a Bad Idea™. In the extreme.

One thing that's missing here and important to understand is that omitting a return statement is rarely a diagnosable error. Consider this function:

int f(int x)
{
    if (x!=42) return x*x;
}

As long as you never call it with an argument of 42, a program containing this function is perfectly valid C and does not invoke any undefined behavior, even though calling f(42) and subsequently attempting to use the return value would invoke UB.

As such, while it's possible for a compiler to provide warning heuristics for missing return statements, it's impossible to do so without false positives or false negatives. This is a consequence of the impossibility of solving the halting problem.
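A sketch of what a false positive can look like, under stated assumptions: `die` and `parse_digit` are hypothetical names invented for this illustration. Every path through `parse_digit` either returns a value or ends the program, so control can never reach the closing brace. If `die` were declared as a plain void function, gcc would typically still report "control reaches end of non-void function", because it cannot see that the call never returns. Marking the helper with C11's `_Noreturn` gives the compiler that information.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints a message and terminates the program.
 * _Noreturn (C11) tells the compiler that this function never returns. */
static _Noreturn void die(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Hypothetical example: returns the numeric value of a decimal digit.
 * The last statement never returns, so no return is needed after it. */
int parse_digit(char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    die("not a digit");
}
```

The annotation does not change what the program does; it only removes a warning that was never describing a real problem.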

I can't tell you the specifics of your platform as I don't know it, but there is a general answer to the behaviour you see.

When a function that returns a value is compiled, the compiler uses a convention for how to return that data. It could be a machine register, or a defined memory location such as a slot on the stack (though generally machine registers are used). The compiled code may also use that location (register or otherwise) while doing the work of the function.

If the function doesn't return anything, then the compiler will not generate code that explicitly fills that location with a return value. However, as I said above, it may use that location during the function. When you write code that reads the return value (ch2 = toUpper(ch);), the compiler emits code that follows its convention for retrieving the return value from the conventional location. As far as the calling code is concerned, it just reads the value from that location, even if nothing was written there explicitly. Hence you get a value.

Now look at @Ray's example: the compiler used the EAX register to store the result of the upper-casing operation. As it happens, this is also the location that return values are written to. On the calling side, ch2 is loaded with the value that's in EAX, hence a phantom return. This is only true of the x86 range of processors; on other architectures the compiler may use a completely different scheme in deciding how the convention should be organised.

However, good compilers will try to optimise according to a set of local conditions, knowledge of the code, rules, and heuristics. So an important thing to note is that it is just luck that this works. The compiler could optimise and not do this, or do something else entirely; you should not rely on this behaviour.

You should keep in mind that such code may crash depending on the compiler. For example, clang can generate a ud2 instruction at the end of such a function, and your app will then crash at run time.
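If a path really is meant to be unreachable, one option is to make that explicit so the outcome does not depend on the compiler. The following is a minimal sketch, not taken from any of the answers: `square_unless_42` is a hypothetical name, and the choice of `abort()` is an assumption about how the failure should be handled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example: the caller promises never to pass 42.
 * Instead of falling off the end of the function, the forbidden
 * path ends in abort(), so a broken promise fails the same way
 * on every compiler and at every optimisation level. */
int square_unless_42(int x)
{
    if (x != 42)
        return x * x;
    abort();  /* never returns, so no return statement is needed here */
}
```

Because the standard library declares `abort()` as a function that does not return, compilers such as gcc generally do not warn about the missing return on that path.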

I tried a small program:

#include <stdio.h>
int f1() {
}
int main() {
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
}

Result:

TEST: <1>
TEST: <10>
TEST: <11>
TEST: <11>
TEST: <11>

I used the mingw32-gcc compiler, so there might be differences.

You could just play around and try, for example, a char function. As long as you don't use the result value, it will still work fine.

#include <stdio.h>
char f1() {
}
int main() {
    f1();
}

But I would still recommend either declaring the function void or returning a value.

Your function seems to need a return:

char toUpper(char c)
{
    if(c>='a'&&c<='z')
        c = c - 32;
    return c;
}

There are no local variables, so the value on the top of the stack at the end of the function will be the parameter c. The value at the top of the stack upon exiting is the return value. So whatever c holds, that's the return value.


 