
With the -Ofast flag on gcc, does breaking down a math expression affect speed?

I want to know whether, with the -Ofast flag on gcc, the code

x += (a * b) + (c * d) + (e * f);

is faster than, slower than, or the same speed as this code:

x += a * b;
x += c * d;
x += e * f;

I have a math expression like this inside of a nested loop so any gain in speed might have a significant effect.

Intuitively, I'd expect these to compile to the same code. But let's see what actually happens! Using godbolt with your first version (the one-liner), we get this code:

    mov     eax, DWORD PTR [rsp+20]
    mov     esi, DWORD PTR [rsp+28]
    imul    esi, DWORD PTR [rsp+32]
    imul    eax, DWORD PTR [rsp+24]
    lea     eax, [rax+rsi]
    mov     esi, DWORD PTR [rsp+36]
    imul    esi, DWORD PTR [rsp+40]
    add     esi, eax
    add     esi, DWORD PTR [rsp+44]
    mov     DWORD PTR [rsp+44], esi

With the second version, we get this:

    mov     esi, DWORD PTR [rsp+28]
    imul    esi, DWORD PTR [rsp+32]
    mov     eax, DWORD PTR [rsp+20]
    imul    eax, DWORD PTR [rsp+24]
    add     eax, DWORD PTR [rsp+44]
    lea     eax, [rax+rsi]
    mov     esi, DWORD PTR [rsp+36]
    imul    esi, DWORD PTR [rsp+40]
    add     esi, eax
    mov     DWORD PTR [rsp+44], esi

These are, as far as I can tell, the same instructions in a slightly different order: three multiplies, three adds, and the same loads and store. I'd expect the performance to be almost identical in these two cases. At most, the different ordering might cause a slight difference in how the instructions schedule in the pipeline.

I suspect that your first version is perfectly fine here.
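
If you want to repeat the comparison yourself, here is a minimal sketch of the two versions as standalone functions. The function names are hypothetical, and the int types are an assumption on my part (the imul instructions in the listings above suggest 32-bit integers):

```c
/* Hypothetical test functions for comparing the generated assembly.
   The names and the int parameter types are assumptions, not taken
   from the original question. */

/* Version 1: the whole sum in a single expression. */
int sum_one_line(int x, int a, int b, int c, int d, int e, int f)
{
    x += (a * b) + (c * d) + (e * f);
    return x;
}

/* Version 2: the same sum accumulated one product at a time. */
int sum_three_steps(int x, int a, int b, int c, int d, int e, int f)
{
    x += a * b;
    x += c * d;
    x += e * f;
    return x;
}
```

Paste this into godbolt with -Ofast, or compile it locally with gcc -Ofast -S to get the assembly in a .s file, and compare the two function bodies. Note that if your real variables are float or double, the picture may differ: -Ofast turns on -ffast-math, which lets gcc reassociate floating-point additions, whereas without it the two forms are not guaranteed to round identically.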


 