
Compiling with the GCC -O2 option generates a different program

I have heard that a C compiler may generate a different program with and without optimization options (that is, compiling a program with optimizations causes it to behave differently), but I have never encountered such a case. Can anyone give a simple example that shows this?

With gcc 4.4.4, this behaves differently at -O0 and -O2:

void foo(int i) {
  foo(i+1);
}

int main(void) {
  foo(0);
}

With optimizations this loops forever. Without optimizations, it crashes (stack overflow!).

Other, more realistic variants would typically depend on timing, be vulnerable to variations in floating-point exactness, or depend on undefined behavior (uninitialized variables, heap/stack layout).
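The loop-versus-crash difference above most likely comes from tail-call elimination: when the recursive call is the last thing a function does, an optimizing compiler can replace the call with a jump, so no new stack frame is needed. A minimal sketch of the same transformation on a function that terminates; the names sum_to_rec and sum_to_loop are hypothetical and not part of the original answer:

```c
/* Tail-recursive sum of 1..n. The recursive call is the final action,
 * so an optimizing compiler may turn it into a jump. */
unsigned long sum_to_rec(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to_rec(n - 1, acc + n);
}

/* Roughly how the optimized version behaves: a loop that reuses
 * a single stack frame. */
unsigned long sum_to_loop(unsigned long n)
{
    unsigned long acc = 0;
    while (n != 0) {
        acc += n;
        n--;
    }
    return acc;
}
```

Both functions return the same value for the same n; only the recursive one, compiled without optimization, needs stack space proportional to n.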

If you look at the assembly generated for this code:

int main ()
{
    int i = 1;
    while (i) ;
    return 0;
}

Without the -O2 flag:

    .file   "test.c"
    .text
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    $1, -4(%ebp)
.L2:
    cmpl    $0, -4(%ebp)
    jne .L2
    movl    $0, %eax
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

With the -O2 flag:

    .file   "test.c"
    .text
    .p2align 4,,15
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
.L2:
    jmp .L2
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

With the -O2 flag, the declaration of i and the return value are omitted. Since i never changes, the compiler can tell the condition is always true, so all that remains is a label and a jump back to that same label, which forms the infinite loop.

Without the -O2 flag, you can clearly see the allocation of space for i on the stack ( subl $16, %esp ) and its initialization ( movl $1, -4(%ebp) ), as well as the evaluation of the while condition ( cmpl $0, -4(%ebp) ) and the return value of the main function ( movl $0, %eax ).

I've seen it in programs that do a lot of math near the limit of floating-point precision. At that limit, arithmetic is not associative, so if operations are performed in a slightly different order, you can get slightly different answers. Also, if a floating-point unit with 80-bit extended precision is used but the results are stored in 64-bit double-precision variables, information can get lost, so the sequence of operations affects the results.
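To make the non-associativity concrete, here is a small sketch that adds the same three values in two different orders. The function names sum_left and sum_right are hypothetical, and the intermediate is declared volatile only so that it is really stored as a 64-bit double:

```c
/* Computes (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile forces a store to a 64-bit double */
    return t + c;
}

/* Computes a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;   /* a tiny c is absorbed by a huge b here */
    return a + t;
}
```

Called with (1e30, -1e30, 1.0), sum_left gives 1.0 while sum_right gives 0.0, because 1.0 is far below the precision available at the magnitude of 1e30. An optimizer that reorders or fuses such operations (for instance under relaxed floating-point options) can therefore change the printed result.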

Optimizations rely on assumptions about:

  • the absence of pointer aliasing in some situations (meaning the compiler can keep values in registers without worrying about modification through another reference)
  • the non-volatility of memory locations in general

This is also why you can get warnings like:

 Type-punned pointers may break strict aliasing rules... (paraphrased)

Warnings like these are intended to save you from headaches when your code develops subtle bugs once it is compiled with optimization on.
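As an illustration of the aliasing point above, here is a sketch of reading the bits of a float without a type-punned pointer cast. The helper name float_bits is hypothetical, and the code assumes that float is 32 bits wide and uses the IEEE 754 format, which holds on common platforms but is not guaranteed by the C standard:

```c
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of a float without violating strict aliasing.
 * Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* well-defined byte-wise copy */
    return bits;
    /* The risky alternative that triggers the warning would be:
     *     return *(uint32_t *)&f;                                  */
}
```

Compilers typically turn the memcpy into a single register move, so the safe form usually costs nothing at -O2.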

In general, in C and C++:

  • be very sure you know what you are doing
  • never play it loose (don't cast char** directly to char*, etc.)
  • use const, volatile, throw() dutifully
  • trust your compiler vendor (or devs), or build with -O0
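A sketch of the volatile advice from the list above: a flag that is changed outside the normal flow of the code (here, by a signal handler) should be declared volatile so that the optimizer does not assume it stays constant. The names got_signal, on_signal and wait_for_flag are hypothetical, and SIGINT is chosen arbitrarily for the demonstration:

```c
#include <signal.h>

/* Flag written by the signal handler; volatile tells the compiler that
 * every read in the source must be a real read of memory. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT in-process, then polls the flag.
 * Returns 1 once the handler has run, 0 if the handler cannot be set. */
int wait_for_flag(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return 0;
    raise(SIGINT);
    while (!got_signal)
        ;   /* without volatile, an optimizer may read the flag only once */
    return 1;
}
```

In this demonstration the handler runs before the loop starts, so it terminates either way. In a real program the signal arrives asynchronously, and a non-volatile flag could leave an optimized build spinning forever.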

I'm sure I missed the epics, but you get the drift.

Typed on my HTC; excuse a typo or two.

The difference between optimization levels usually stems from uninitialized variables. For example:

#include <stdio.h>

int main()
{
    int x;
    printf("%d\n", x);
    return 0;
}

When compiled with -O0, this outputs 5895648. When compiled with -O2, it outputs a different number each time I run it; for example, -1077877612.

The difference can be more subtle; imagine you have the following code:

int x; // uninitialized
if (x % 10 == 8)
    printf("Go east\n");
else
    printf("Go west\n");

With -O0, this will output Go east, and with -O2, (usually) Go west.
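Reading an uninitialized local variable is undefined behavior, so neither output above is the "right" one. A sketch of the same decision with the value always supplied by the caller, so that the result no longer depends on the optimization level; the function name direction is hypothetical:

```c
/* Same test as above, but x is a parameter and therefore always has
 * a defined value when it is read. */
const char *direction(int x)
{
    return (x % 10 == 8) ? "Go east" : "Go west";
}
```

For example, direction(18) yields Go east at every optimization level. GCC can usually flag the original mistake at compile time when warnings such as -Wall are enabled.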

Examples of correct programs that produce different output at different optimization levels can be found in bug reports, and they would "work" only on specific versions of GCC.

But it would be easy to achieve this by invoking undefined behavior (UB). The program would no longer be correct, however, and it could also generate different output with different versions of GCC (among other things, see mythology).

It is rare to find a case where -O2 does not generate different code than compiling without optimization.

unsigned int fun ( unsigned int a )
{
   return(a+73);
}

Without optimization:

fun:
    str fp, [sp, #-4]!
    .save {fp}
    .setfp fp, sp, #0
    add fp, sp, #0
    .pad #12
    sub sp, sp, #12
    str r0, [fp, #-8]
    ldr r3, [fp, #-8]
    add r3, r3, #73
    mov r0, r3
    add sp, fp, #0
    ldmfd   sp!, {fp}
    bx  lr

With optimization:

fun:
    add r0, r0, #73
    bx  lr

Even this function:

void fun ( void )
{
}

Without optimization:

fun:
    str fp, [sp, #-4]!
    .save {fp}
    .setfp fp, sp, #0
    add fp, sp, #0
    add sp, fp, #0
    ldmfd   sp!, {fp}
    bx  lr

With optimization:

fun:
    bx  lr

If you declared everything volatile and created a need for the frame pointer, you might approach something where the unoptimized and optimized output were the same. Likewise, if you compiled a debuggable version (not sure what that switch is), that will behave as if everything is volatile so that you can use a debugger to watch variables in memory and single-step. That might also approach the same output from the same input.

Also note that, with or without optimization, you should expect to see different output from the same source code with different compilers; even different major versions of gcc produce different results. Trivial functions like those above will normally produce the same results with optimization across many compilers. But more complicated functions with many more variables can be expected to produce different results from compiler to compiler.

The following code outputs Here i am! when compiled without optimizations, but nothing when compiled with optimizations.

The idea is that the function x() is declared as "pure" (having no side effects), so the compiler can optimize the call away when its result is unused (my compiler is gcc 4.1.2).

#include <stdio.h>

int x() __attribute__ ((pure));

int x()
{
    return printf("Here i am!\n");
}

int main()
{
    int y = x();
    return 0;
}
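The example above deliberately breaks the promise made by the attribute, since printf is a side effect. For contrast, here is a sketch of a function for which the GCC-specific pure attribute is appropriate; the name square is hypothetical:

```c
/* GCC-specific attribute: promises that the function has no side effects
 * and that its result depends only on its arguments and on memory it reads. */
int square(int v) __attribute__((pure));

int square(int v)
{
    return v * v;
}
```

Because the promise holds here, removing or merging calls to square cannot change what the program does, so optimized and unoptimized builds behave the same.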

One answer to this question could be:

Every ANSI C compiler is required to support at least:

  • 31 parameters in a function definition
  • 31 arguments in a function call
  • 509 characters in a source line
  • 32 levels of nested parentheses in an expression
  • The maximum value of long int can't be any less than 2,147,483,647 (i.e., long integers are at least 32 bits).

Source: Expert C Programming - Peter van der Linden

It could be that the compiler supports, say, 31 parameters in a function definition at -O0 and 35 at -O3, because the standard sets only a minimum and not an exact value. Personally I think this is a design flaw that could be improved. But in short: there are things in a compiler that are not bounded by the standard and can vary with the implementation, including the optimization level.
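Of the limits listed above, the minimum range of long is one that a program can check directly through <limits.h>. A minimal sketch, with the hypothetical helper long_value_bits reporting how far a given implementation goes beyond the required minimum:

```c
#include <limits.h>

/* Compile-time check of the guaranteed minimum range of long. */
#if LONG_MAX < 2147483647L
#error "long is narrower than the C standard allows"
#endif

/* Counts the value bits of long on this implementation.
 * The standard guarantees at least 31; many platforms provide 63. */
int long_value_bits(void)
{
    int bits = 0;
    unsigned long max = LONG_MAX;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}
```

The other limits, such as the number of parameters, have no such macro, so the only portable assumption is the minimum itself.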

Hope this helps and, as Mark Loeser said, you should be more specific in your question.

