
Compiling with the GCC -O2 option generates a different program

I have heard that compiling a C program with and without optimization options may generate different programs (i.e., compiling with optimizations causes the program to behave differently), but I have never encountered such a case. Can anyone give a simple example that shows this?

For gcc 4.4.4, this program behaves differently with -O0 and -O2:

void foo(int i) {
  foo(i + 1);
}

int main(void) {
  foo(0);
}

With optimizations, this loops forever. Without optimizations, it crashes (stack overflow!).

Other, more realistic variants would typically be dependent on timing, vulnerable to floating-point exactness variations, or dependent on undefined behavior (uninitialized variables, heap/stack layout).

If you look at the assembly generated by this code:

int main ()
{
    int i = 1;
    while (i) ;
    return 0;
}

Without the -O2 flag:

 .file   "test.c"
    .text
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    $1, -4(%ebp)
.L2:
    cmpl    $0, -4(%ebp)
    jne .L2
    movl    $0, %eax
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

With the -O2 flag:

 .file   "test.c"
    .text
    .p2align 4,,15
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
.L2:
    jmp .L2
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

With the -O2 flag, the declaration of i and the return value are omitted, and all that remains is a label with a jump back to that same label, which constitutes the infinite loop.

Without the -O2 flag, you can clearly see the allocation of space for i on the stack ( subl $16, %esp ) and its initialization ( movl $1, -4(%ebp) ), as well as the evaluation of the while condition ( cmpl $0, -4(%ebp) ) and the return value of the main function ( movl $0, %eax ).

I've seen it in programs that do a lot of math near the floating-point precision limit. At the limit, arithmetic is not associative, so if operations are performed in slightly different orders, you can get slightly different answers. Also, if an FPU with 80-bit extended precision is used but the results are stored in 64-bit double-precision variables, information can get lost, so the sequence of operations affects the results.

Optimizations rely on assumptions about:

  • the absence of pointer aliasing in some situations (meaning the compiler can keep values in registers without worrying about modification through another reference)
  • the non-volatility of memory locations in general

It is also because of this that you can get warnings like

 Type-punned pointers may break strict aliasing rules... (paraphrased)

Warnings like these are intended to save you from headaches when your code develops subtle bugs once compiled with optimization on.

In general, in C and C++:

  • be very sure you know what you are doing
  • never play it loose (don't cast char** directly to char*, etc.)
  • use const, volatile, and throw() dutifully
  • trust your compiler vendor (or devs), or build with -O0

I'm sure I missed the epics but you get the drift.

typed on my htc. Excuse a typo or two

The difference between optimization levels often stems from uninitialized variables. For example:

#include <stdio.h>

int main()
{
    int x;
    printf("%d\n", x);
    return 0;
}

When compiled with -O0 , this outputs 5895648 . When compiled with -O2 , it outputs a different number each time I run it; for example, -1077877612 . (Reading an uninitialized variable is undefined behavior, so any output, or none, is permitted.)

The difference can be more subtle; imagine you have the following code:

int x; // uninitialized
if (x % 10 == 8)
    printf("Go east\n");
else
    printf("Go west\n");

With -O0 , this will output Go east , and with -O2 , (usually) Go west .

Examples of correct programs that produce different outputs at different optimization levels can be found in bug-submission reports, and they would "work" only on specific versions of GCC.

But it would be easy to achieve this by invoking undefined behavior. The program would no longer be correct, however, and it could also produce different outputs with different versions of GCC (among other things; see mythology).

It is rare to find a case where -O2 does not generate different code than not using optimization.

unsigned int fun ( unsigned int a )
{
   return(a+73);
}

Without optimization:

fun:
    str fp, [sp, #-4]!
    .save {fp}
    .setfp fp, sp, #0
    add fp, sp, #0
    .pad #12
    sub sp, sp, #12
    str r0, [fp, #-8]
    ldr r3, [fp, #-8]
    add r3, r3, #73
    mov r0, r3
    add sp, fp, #0
    ldmfd   sp!, {fp}
    bx  lr

with optimization:

fun:
    add r0, r0, #73
    bx  lr

Even this function:

void fun ( void )
{
}

Without optimization:

fun:
    str fp, [sp, #-4]!
    .save {fp}
    .setfp fp, sp, #0
    add fp, sp, #0
    add sp, fp, #0
    ldmfd   sp!, {fp}
    bx  lr

With optimization:

fun:
    bx  lr

If you declared everything volatile and created a need for the frame pointer, you might approach a situation where the unoptimized and optimized output were the same. Likewise, if you compiled a debuggable version (I am not sure what that switch is), it behaves as if everything were volatile so that you can use a debugger to watch variables in memory and single-step; that might also approach producing the same output from the same input.

Also note that, with or without optimization, different compilers are expected to produce different output from the same source code; even different major versions of gcc produce different results. Trivial functions like those above will normally produce the same optimized result under many compilers, but more complicated functions with many more variables can be expected to differ from compiler to compiler.

The following code prints Here i am! when compiled without optimizations, but nothing when compiled with optimizations.

The idea is that the function x() is declared "pure" (having no side effects), so the compiler can optimize the unused call away (my compiler is gcc 4.1.2 ). Note that printf does have a side effect, so marking x() pure is lying to the compiler, and that lie is exactly what lets the output change with optimization.

#include <stdio.h>

int x() __attribute__ ((pure));

int x()
{
    return printf("Here i am!\n");
}

int main()
{
    int y = x();
    return 0;
}

One answer to this question could be:

Every ANSI C compiler is required to support at least:

  • 31 parameters in a function definition
  • 31 arguments in a function call
  • 509 characters in a source line
  • 32 levels of nested parentheses in an expression
  • The maximum value of long int can't be any less than 2,147,483,647 (i.e., long integers are at least 32 bits).

Source: Expert C Programming - Peter van den Linden

It could be that a compiler supports, say, 31 parameters in a function definition at -O0 and 35 at -O3, because the standard specifies only minimums. Personally, I think this is a design flaw and very improvable. In short: there are things in a compiler that are not bounded by the standard and can change with the implementation, including across optimization levels.

Hope this helps, and as Mark Loeser said, you should be more specific in your question.
