
Will a C++ compiler inline a for-loop with a small number of terms?

Suppose I have a class Matrix5x5 (with suitably overloaded index operators) and I write a method trace for calculating the sum of its diagonal elements:

double Matrix5x5::trace(void){
    double t(0.0);
    for(int i(0); i <= 4; ++i){
        t += (*this)[i][i];
    }
    return t;
}

Of course, if I instead wrote:

return (*this)[0][0]+(*this)[1][1]+(*this)[2][2]+(*this)[3][3]+(*this)[4][4];

then I would be sure to avoid the overhead of declaring and incrementing my i variable. But it feels quite stupid to write out all those terms!

Since my loop has a constexpr number of terms that happens to be quite small, would a compiler inline it for me?

If your compiler is clever enough, it can optimize this case under the as-if rule. A C++ compiler may optimize a lot of things that way, but it also may not. The only way to be absolutely sure is to check the code your specific compiler generates. Having said that, it's unlikely this will be a bottleneck in your program, so write whichever version is more readable.
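As a rough sketch of why the as-if rule applies here: the loop and the written-out sum have the same observable result, so the compiler is free to turn one into the other. The struct below is a simplified stand-in for the asker's class (plain array member, no overloaded operators), and the names trace_loop and trace_unrolled are made up for illustration. Marking them constexpr is my own addition so the equivalence can be checked at compile time; it needs C++14 or later.

```cpp
// Simplified stand-in for the class in the question (assumption: a plain
// 5x5 array of double, no overloaded index operators).
struct Matrix5x5 {
    double values[5][5];

    // Loop form, as in the question.
    constexpr double trace_loop() const {
        double t = 0.0;
        for (int i = 0; i < 5; ++i) {
            t += values[i][i];
        }
        return t;
    }

    // Hand-written form with every term spelled out.
    constexpr double trace_unrolled() const {
        return values[0][0] + values[1][1] + values[2][2]
             + values[3][3] + values[4][4];
    }
};
```

To see what your own compiler does with each form, compile with something like g++ -O2 -S (or paste the code into Compiler Explorer) and compare the generated assembly for the two functions.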

Yes! GCC unrolls the loop at optimization level -O1 and above, and clang does so at optimization level -O2 and above.

I tested it using this code:

struct Matrix5x5 {
    double values[5][5];
    Matrix5x5() : values() {}

    double trace() {
        double sum = 0.0;
        for(int i = 0; i < 5; i++) {
            sum += values[i][i]; 
        }
        return sum; 
    }
};

double trace_of(Matrix5x5& m) {
    return m.trace(); 
}

And this is the assembly produced by both gcc and clang:

trace_of(Matrix5x5&):
    pxor    xmm0, xmm0
    addsd   xmm0, QWORD PTR [rdi]
    addsd   xmm0, QWORD PTR [rdi+48]
    addsd   xmm0, QWORD PTR [rdi+96]
    addsd   xmm0, QWORD PTR [rdi+144]
    addsd   xmm0, QWORD PTR [rdi+192]
    ret
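The displacements in that listing (0, 48, 96, 144, 192) are just the byte offsets of the diagonal elements. In a row-major double[5][5], element [i][i] is the (5*i + i)-th double, so with 8-byte doubles each step along the diagonal is 6 * 8 = 48 bytes. Here is a small sketch of that arithmetic; diagonal_offset is a hypothetical helper, not something from the tested code.

```cpp
#include <cstddef>

// Byte offset of values[i][i] from the start of a row-major double[5][5].
// Each row holds 5 doubles, so element [i][i] is the (5*i + i)-th double.
constexpr std::size_t diagonal_offset(std::size_t i) {
    return (5 * i + i) * sizeof(double);
}
```

On a platform where sizeof(double) is 8, diagonal_offset(1) is 48 and diagonal_offset(4) is 192, matching the addsd operands above.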

You can play around with the code and look at the corresponding assembly here: https://godbolt.org/z/p2uF0E

If you overload operator[], then you have to raise the optimization level to -O3, but the compiler will still do it: https://godbolt.org/z/JInIME
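For reference, here is one way the overloaded version might look. This is only a sketch and not necessarily the code behind the link above: it assumes operator[] returns a pointer to the row, so that m[i][j] falls back to built-in pointer indexing for the second subscript. The constexpr qualifiers are my addition (C++14 or later) and are not required for the optimization.

```cpp
struct Matrix5x5 {
    double values[5][5];

    // Return a pointer to the requested row; the second [] in m[i][j]
    // is then ordinary pointer indexing. No bounds checking is done.
    constexpr double* operator[](int row) { return values[row]; }
    constexpr const double* operator[](int row) const { return values[row]; }

    // Same loop as in the question, going through the overloaded operator.
    constexpr double trace() const {
        double t = 0.0;
        for (int i = 0; i <= 4; ++i) {
            t += (*this)[i][i];
        }
        return t;
    }
};
```

Because the operator is defined inside the class, it is implicitly inline, which gives the optimizer a fair chance of seeing through it before unrolling the loop. Whether it actually does so still depends on the compiler and flags, so check the assembly.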
