
Why and how does GCC compile a function with a missing return statement?

#include <stdio.h>

char toUpper(char);

int main(void)
{
    char ch, ch2;
    printf("lowercase input : ");
    ch = getchar();
    ch2 = toUpper(ch);
    printf("%c ==> %c\n", ch, ch2);

    return 0;
}

char toUpper(char c)
{
    if(c>='a'&&c<='z')
        c = c - 32;
}

In the toUpper function, the return type is char, but there is no return statement. I compiled the source code with gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4) on Fedora 14.

Of course, a warning is issued: "warning: control reaches end of non-void function", but the program works well.

What happens to that code when gcc compiles it? I would like a solid answer for this case. Thanks :)

What happened for you is that when the C program was compiled into assembly language, your toUpper function ended up like this, perhaps:

_toUpper:
LFB4:
        pushq   %rbp
LCFI3:
        movq    %rsp, %rbp
LCFI4:
        movb    %dil, -4(%rbp)
        cmpb    $96, -4(%rbp)
        jle     L8
        cmpb    $122, -4(%rbp)
        jg      L8
        movzbl  -4(%rbp), %eax
        subl    $32, %eax
        movb    %al, -4(%rbp)
L8:
        leave
        ret

The subtraction of 32 was carried out in the %eax register. And in the x86 calling convention, that is the register in which the return value is expected to be! So... you got lucky.

But please pay attention to the warnings. They are there for a reason!

It depends on the Application Binary Interface and which registers are used for the computation.

E.g. on x86, the return value is passed in EAX, and gcc is most likely using that register to hold the result of the calculation as well.

Essentially, c is pushed into the spot that should later be filled with the return value; since it's not overwritten by use of return, it ends up as the value returned.

Note that relying on this (in C, or any other language where this isn't an explicit language feature, like Perl), is a Bad Idea™. In the extreme.

One missing thing that's important to understand is that it's rarely a diagnosable error to omit a return statement. Consider this function:

int f(int x)
{
    if (x!=42) return x*x;
}

As long as you never call it with an argument of 42, a program containing this function is perfectly valid C and does not invoke any undefined behavior, despite the fact that it would invoke UB if you called f(42) and subsequently attempted to use the return value.

As such, while it's possible for a compiler to provide warning heuristics for missing return statements, it's impossible to do so without false positives or false negatives. This is a consequence of the impossibility of solving the halting problem.
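A minimal sketch of what such a false positive can look like. The enum and the function name color_name are made up for illustration; the assumption is a GCC build with -Wall (which enables -Wreturn-type). Every enumerator returns, yet without the final abort() the compiler may still warn, because it cannot prove that no other value ever reaches the switch:

```c
#include <stdlib.h>

/* Hypothetical example type, not from the question. */
enum color { RED, GREEN, BLUE };

/* All three enumerators return a value. The compiler still has to
   assume that some other integer could be stored in c, so without a
   statement after the switch it may report "control reaches end of
   non-void function". */
const char *color_name(enum color c)
{
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    /* Not reached for valid enumerators. abort() never returns, so
       control cannot fall off the end of the function here. */
    abort();
}
```

Ending with abort() rather than a dummy return value is one possible choice: it makes a bad argument fail loudly instead of handing the caller a made-up result.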

I can't tell you the specifics of your platform, as I don't know it, but there is a general answer to the behaviour you see.

When a function that returns a value is compiled, the compiler uses a convention for how to return that data. It could be a machine register, or a defined memory location such as a slot on the stack (though generally machine registers are used). The compiled code may also use that location (register or otherwise) while doing the work of the function.

If the function doesn't return anything, the compiler will not generate code that explicitly fills that location with a return value. However, as I said above, it may use that location during the function. When you write code that reads the return value (ch2 = toUpper(ch);), the compiler generates code that follows its convention for retrieving the return value from the conventional location. As far as the calling code is concerned, it just reads the value from that location, even if nothing was explicitly written there. Hence you get a value.

Now look at @Ray's example: the compiler used the EAX register to store the result of the upper-casing operation. It just so happens that this is probably the location that return values are written to. On the calling side, ch2 is loaded with the value that's in EAX, hence a phantom return. This is only true of the x86 range of processors; on other architectures the compiler may use a completely different scheme for its convention.

However, good compilers will try to optimise according to a set of local conditions, knowledge of the code, rules, and heuristics. So an important thing to note is that it is just luck that this works. The compiler could optimise differently and not do this at all, so you should not rely on the behaviour.

You should keep in mind that such code may crash, depending on the compiler. For example, clang can generate a ud2 instruction at the end of such a function, in which case your app will crash at run time.

I have tried a small program:

#include <stdio.h>
int f1() {
}
int main() {
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
    printf("TEST: <%d>\n",  f1());
}

Result:

TEST: <1>
TEST: <10>
TEST: <11>
TEST: <11>
TEST: <11>

I used the mingw32-gcc compiler, so there might be differences.

You could just play around and try, e.g., a char function. As long as you don't use the result value, it will still work fine.

#include <stdio.h>
char f1() {
}
int main() {
    f1();
}

But I would still recommend either declaring the function void or returning some value.
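A minimal sketch of those two options applied to f1. The names f1_void and f1_value are hypothetical, and the returned 0 is an arbitrary placeholder:

```c
/* Option 1: the function has nothing to report, so declare it void.
   A caller can then no longer try to read a result from it. */
void f1_void(void)
{
}

/* Option 2: keep the int return type and return an explicit value.
   The 0 here is an arbitrary placeholder for illustration. */
int f1_value(void)
{
    return 0;
}
```

With the second version, printf("TEST: <%d>\n", f1_value()); prints the same defined value on every call instead of whatever happens to be left in the return register.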

Your function seems to need a return:

char toUpper(char c)
{
    if(c>='a'&&c<='z')
        c = c - 32;
    return c;
}
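Note that c - 32 assumes an ASCII-compatible character set. As a hedged alternative, here is a sketch of the same fix built on the standard toupper from <ctype.h>; the wrapper name to_upper_std is made up for illustration:

```c
#include <ctype.h>

/* Hypothetical variant of toUpper that returns a value on every path. */
char to_upper_std(char c)
{
    /* toupper expects a value representable as unsigned char (or EOF),
       so cast before the call and convert the int result back to char. */
    return (char)toupper((unsigned char)c);
}
```

Characters that are not lowercase letters are returned unchanged, which matches what the hand-written version does.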

There are no local variables, so the value on the top of the stack at the end of the function will be the parameter c. The value at the top of the stack upon exiting is the return value. So whatever c holds, that's the return value.
