Why does gcc create redundant assembly code?
I wanted to look into how certain C/C++ features are translated into assembly, so I created the following file:
struct foo {
    int x;
    char y[0];
};
char *bar(struct foo *f)
{
    return f->y;
}
I then compiled this with gcc -S (and also tried g++ -S), but when I looked at the assembly code, I was disappointed to find a trivial redundancy in the bar function that I thought gcc should be able to optimize away:
_bar:
Leh_func_begin1:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movabsq $4, %rcx
addq %rcx, %rax
movq %rax, -24(%rbp)
movq -24(%rbp), %rax
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
popq %rbp
ret
Leh_func_end1:
Among other things, the lines
movq %rax, -24(%rbp)
movq -24(%rbp), %rax
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
seem pointlessly redundant. Is there any reason gcc (and possibly other compilers) cannot or does not optimize this away?
"I thought gcc should be able to optimize away."
From the gcc manual:
"Without any optimization option, the compiler's goal is to reduce the cost of compilation and to make debugging produce the expected results."
In other words, it doesn't optimize unless you ask it to. When I turn on optimizations using the -O3 flag, gcc 4.4.6 produces much more efficient code:
bar:
.LFB0:
.cfi_startproc
leaq 4(%rdi), %rax
ret
.cfi_endproc
For more details, see Options That Control Optimization in the manual.
Without optimization, the compiler typically emits a straight instruction-by-instruction translation. The instructions being translated are not the statements of the source program but those of an intermediate representation, and lowering to that representation may itself have introduced redundancy.
If you expect assembly without such redundant instructions, use gcc -O -S.
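To make the "intermediate representation" point concrete, here is a rough sketch in C of what the unoptimized output is doing. It is a hand-written imitation, not actual compiler output: the names bar_unoptimized_shape, arg_slot, tmp_slot, ret_slot and check_same_address are made up for illustration, and the asm comments only indicate which instruction from the question each line loosely corresponds to. The sketch also uses the standard flexible array member y[] instead of the GNU y[0] form used in the question.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
    int x;
    char y[];   /* flexible array member; the question uses the GNU form y[0] */
};

/* What the source asks for: the address of member y. */
char *bar(struct foo *f)
{
    return f->y;
}

/* Hypothetical imitation of the unoptimized output: every intermediate
 * value gets its own local, much as each one got its own stack slot. */
char *bar_unoptimized_shape(struct foo *f)
{
    struct foo *arg_slot = f;                                /* movq %rdi, -8(%rbp)  */
    char *sum = (char *)arg_slot + offsetof(struct foo, y);  /* addq %rcx, %rax      */
    char *tmp_slot = sum;                                    /* movq %rax, -24(%rbp) */
    char *ret_slot = tmp_slot;                               /* movq %rax, -16(%rbp) */
    return ret_slot;                                         /* movq -16(%rbp), %rax */
}

/* Both versions compute the same address. */
void check_same_address(void)
{
    struct foo s;
    assert(bar(&s) == bar_unoptimized_shape(&s));
}
```

Each temporary is trivially foldable, which is why an optimizing build can reduce the whole function to a single address computation.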
The kind of optimization you were expecting is called peephole optimization. Compilers usually have plenty of these because, unlike more global optimizations, they are cheap to apply and (generally) do not risk making the code worse, at least if applied towards the end of the compilation.
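As a minimal sketch of the idea, the following C code runs one peephole rule over a toy instruction list. Everything here is hypothetical (the enum op, struct insn, and both function names are invented for illustration) and it does not reflect how GCC represents or rewrites instructions internally. The single rule drops a load that immediately follows a store of the same register to the same stack slot, since the register already holds that value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set, only for illustration. */
enum op { OP_STORE, OP_LOAD, OP_OTHER };

struct insn {
    enum op op;
    int reg;    /* register number */
    int slot;   /* stack slot offset, ignored for OP_OTHER */
};

/* Drops a load that directly follows a store of the same register to the
 * same slot. Rewrites code[] in place and returns the new length. */
size_t peephole_drop_reloads(struct insn *code, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 &&
            code[i].op == OP_LOAD &&
            code[out - 1].op == OP_STORE &&
            code[out - 1].reg == code[i].reg &&
            code[out - 1].slot == code[i].slot) {
            continue;   /* redundant reload: skip it */
        }
        code[out++] = code[i];
    }
    return out;
}

/* The four moves quoted in the question, with register 0 standing in
 * for %rax. Returns the number of instructions left after the pass. */
size_t demo_question_sequence(void)
{
    struct insn code[] = {
        { OP_STORE, 0, -24 },   /* movq %rax, -24(%rbp) */
        { OP_LOAD,  0, -24 },   /* movq -24(%rbp), %rax */
        { OP_STORE, 0, -16 },   /* movq %rax, -16(%rbp) */
        { OP_LOAD,  0, -16 },   /* movq -16(%rbp), %rax */
    };
    return peephole_drop_reloads(code, sizeof code / sizeof code[0]);
}
```

This rule only removes the two reloads. The two stores remain, because proving that a stack slot is never read again requires looking beyond a small window of adjacent instructions.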
In this blog post, I provide an example where both GCC and Clang may go as far as generating shorter 32-bit instructions when the integer type in the source code is 64-bit but only the lowest 32 bits of the result matter.
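The blog post's own example is not reproduced here, so the following is only an assumed illustration of why such narrowing is legal. The function names are hypothetical. Unsigned addition wraps modulo 2^32 after truncation, so adding in 64 bits and then truncating gives the same low 32 bits as truncating first and adding in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit arithmetic in the source, but only the low 32 bits are returned. */
uint32_t low32_of_sum_wide(uint64_t a, uint64_t b)
{
    return (uint32_t)(a + b);
}

/* Same result with the operands truncated first: unsigned addition wraps
 * modulo 2^32, so the upper halves of a and b cannot affect the result. */
uint32_t low32_of_sum_narrow(uint64_t a, uint64_t b)
{
    return (uint32_t)a + (uint32_t)b;
}
```

Because the two functions agree for every input, a compiler is free to pick whichever form has the shorter encoding on the target.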