
GNU Compiler optimization

I don't know much about compilers, but I know they are complicated and smart enough to optimize your code. Say I had code that looked like this:

 string foo = "bar";
 for(int i = 0; i < foo.length(); i++){
     //some code that does not modify the length of foo
 }

Would the GNU compiler be smart enough to realize that the length of foo does not change over the course of this loop and replace the foo.length() call with the proper value? Or would foo.length() be called for every i comparison?

Since both Mysticial and Kerrek rightfully suggest peeking at the generated assembly, here's an example:

#include <string>
using namespace std;

int does_clang_love_me(string foo) {
    int j = 0;
    for (int i = 0; i < foo.length(); i++) {
        j++;
    }
    return j;
}

I saved the above code in test.cpp and compiled it like this:

$ clang++ -o test.o -Os -c test.cpp

The -Os switch tells clang to optimize for the smallest code size. GCC accepts the same -Os switch. To see the assembly, I ran otool on the resulting object file, as I happen to be using a Mac at the moment. Other platforms have similar tools, such as objdump -d on Linux.

$ otool -tv test.o

test.o:
(__TEXT,__text) section
__Z16does_clang_love_meSs:
0000000000000000    pushq   %rbp
0000000000000001    movq    %rsp,%rbp
0000000000000004    movq    (%rdi),%rax
0000000000000007    movq    0xe8(%rax),%rcx
000000000000000b    xorl    %eax,%eax
000000000000000d    testq   %rcx,%rcx
0000000000000010    je  0x0000001e
0000000000000012    cmpq    $0x01,%rcx
0000000000000016    movl    $0x00000001,%eax
000000000000001b    cmoval  %ecx,%eax
000000000000001e    popq    %rbp
000000000000001f    ret

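It's like Mysticial said; it's just a variable access. There is no call to length() in the output and no loop left either: the length is loaded once and the return value is computed from it directly.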

The only way to know for sure is to try it and take a look at the assembly.

My guess is that if the call to length() is inlined, then Loop Invariant Code Motion will hoist the internals of length() out of the loop and replace it with a single variable.
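As a rough illustration of what that hoisting amounts to, here are two hand-written versions of the loop. The function names are made up for this sketch, and the second one is only what the optimizer's result might look like if it were written back as source; whether GCC or Clang really does this for a given loop depends on the optimization level and on what the body contains.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// As written in the question: length() sits in the loop condition,
// so it is logically evaluated before every iteration.
std::size_t count_naive(const std::string& foo) {
    std::size_t j = 0;
    for (std::size_t i = 0; i < foo.length(); i++) {
        j++;
    }
    return j;
}

// After hoisting: the invariant value is read once, before the loop.
std::size_t count_hoisted(const std::string& foo) {
    std::size_t j = 0;
    const std::size_t len = foo.length();  // valid because the body never modifies foo
    for (std::size_t i = 0; i < len; i++) {
        j++;
    }
    return j;
}

// Both versions must agree for the transformation to be legal.
void check_equivalent(const std::string& s) {
    assert(count_naive(s) == count_hoisted(s));
}
```

Calling check_equivalent("bar") passes, since both functions return 3 for that input.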

On second thought, this might even be moot. The size of a string is probably just a simple field in the string class - which is on the stack. So just inlining the call to length() will already have the effect of reducing the call to a simple variable access.

EDIT: In this latter case, it doesn't even matter whether or not the length of foo is modified inside the loop. Getting the length of a string is already just a variable access.

The compiler has to guarantee that the program behaves as if length() was called in every round. It can only hoist the call out of the loop if it can prove that there are no side effects and that the result is indeed constant.
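To make that concrete, here is a small sketch of a loop where the length is not constant, so caching it would change what the program does. Both function names are hypothetical, and the second function is a deliberately wrong "optimization" written by hand, not something a conforming compiler would produce.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The body shrinks the string, so length() is NOT loop-invariant here
// and has to be re-read before each iteration.
std::size_t iterations_recomputed(std::string s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.length(); i++) {
        s.pop_back();  // changes the value that length() returns
        n++;
    }
    return n;
}

// Caching the length by hand gives a different iteration count,
// which is why the compiler may not hoist the call in this loop.
std::size_t iterations_cached(std::string s) {
    std::size_t n = 0;
    const std::size_t len = s.length();
    for (std::size_t i = 0; i < len; i++) {
        s.pop_back();
        n++;
    }
    return n;
}

// The two counts differ whenever the string has at least two characters.
void check_hoisting_is_unsafe() {
    assert(iterations_recomputed("abcd") != iterations_cached("abcd"));
}
```

For the input "abcd" the first function runs the body 2 times and the second 4 times, so the two are not interchangeable.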

What happens in a real example needs to be analyzed case by case. Just look at the assembly if you're curious.

The typical way to enforce the hoisting is to just do it manually:

for (unsigned int i = 0, end = s.length(); i != end; ++i)

Perhaps you'd also like to consider the modern for (char & c : s) as an alternative.
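Here is a minimal sketch that puts both forms side by side. The function names and the uppercase-counting body are invented for the example, and std::size_t is used instead of unsigned int to match the type length() returns. In the range-based for, the begin and end iterators are obtained once before the loop starts, so the body must not change the size of s.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Manual hoisting, as in the one-liner above: end is computed once.
std::size_t count_upper_indexed(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0, end = s.length(); i != end; ++i) {
        if (s[i] >= 'A' && s[i] <= 'Z') n++;
    }
    return n;
}

// Range-based for (C++11): no index and no explicit length at all.
std::size_t count_upper_range(const std::string& s) {
    std::size_t n = 0;
    for (const char& c : s) {
        if (c >= 'A' && c <= 'Z') n++;
    }
    return n;
}

// Both loops visit the same characters in the same order.
void check_same_count(const std::string& s) {
    assert(count_upper_indexed(s) == count_upper_range(s));
}
```

Both functions return 3 for "GnU Cc".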

Honestly, I don't know exactly how gcc will optimize this code snippet. But moving redundant code outside the loop is called "partial redundancy elimination". Moving foo.length() outside the loop, which is called loop invariant code motion, is one form of partial redundancy elimination. Please have a look at the Dragon Book section 9.5 (I'm also reading this chapter), which elaborates on how to solve these kinds of problems using data flow analysis. Here are some slides from Stanford University: http://suif.stanford.edu/~courses/cs243/lectures/l5.pdf . Hope these will help.
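For anyone unfamiliar with the term, here is a tiny hand-made example of a partially redundant expression outside of any loop. The function names are made up, and the "after" version is applied by hand to show the idea; it is not output taken from gcc.

```cpp
#include <cassert>

// Before: a + b is computed twice when c is true, and once otherwise.
int pre_before(bool c, int a, int b) {
    int x = 0;
    if (c) {
        x = a + b;
    }
    int y = a + b;  // redundant only on the path where c was true
    return x + y;
}

// After: the expression is evaluated exactly once on every path.
int pre_after(bool c, int a, int b) {
    int x = 0;
    int t;
    if (c) {
        t = a + b;
        x = t;
    } else {
        t = a + b;  // inserted so that t is available on this path too
    }
    int y = t;
    return x + y;
}

// The transformation must not change the result on either path.
void check_pre(int a, int b) {
    assert(pre_before(true, a, b) == pre_after(true, a, b));
    assert(pre_before(false, a, b) == pre_after(false, a, b));
}
```

Loop invariant code motion fits the same pattern: the value computed in the previous iteration makes the recomputation redundant on the back edge of the loop, but not on the path that enters the loop for the first time.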
