Doubts about gcc O3 optimisation flag

I am using g++ 4.7.3 and trying to follow the description of the optimisation flags at http://gcc.gnu.org/onlinedocs/gcc-4.7.3/gcc/Optimize-Options.html, and I have run into the following problem:

I have a program whose run time differs between the -O2 and -O3 flags: it takes 8ms with -O2 and 16ms with -O3, so -O2 is twice as fast as -O3.

So I would like to understand exactly what makes the difference. The page linked above says:

"O3 Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-vectorize and -fipa-cp-clone options."

So I simply took -O2 and added all the flags listed there:

-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone

With these options the time is 30ms. But this set of options should be equivalent to -O3. Why is the time different? What am I doing wrong?

PS: All results are perfectly reproducible to within 1ms.


I have checked the options using

g++ -c -Q -Ox --help=optimizers

and saw that -O3 enables one additional option: -ftree-loop-distribute-patterns. But when I add it to the option set:

-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone -ftree-loop-distribute-patterns

the time is still 30ms.

You can get g++ to show you which options are active with the -Q option:

g++ -c -Q -O3 --help=optimizers

The output is something like:

  -O<number>
  -Ofast
  -Os
  -falign-functions                     [enabled]
  -falign-jumps                         [enabled]
  -falign-labels                        [enabled]
  -falign-loops                         [enabled]
  -fasynchronous-unwind-tables          [enabled]
  -fbranch-count-reg                    [enabled]
  -fbranch-probabilities                [disabled]
  -fbranch-target-load-optimize         [disabled]
  -fbranch-target-load-optimize2        [disabled]
  -fbtr-bb-exclusive                    [disabled]
  -fcaller-saves                        [enabled]
  -fcombine-stack-adjustments           [enabled]
  -fcommon                              [enabled]
  -fcompare-elim                        [enabled]
  -fconserve-stack                      [disabled]
  -fcprop-registers                     [enabled]
  -fcrossjumping                        [enabled]
  -fcse-follow-jumps                    [enabled]
  -fcx-fortran-rules                    [disabled]
  -fcx-limited-range                    [disabled]
  -fdata-sections                       [disabled]
  -fdce                                 [enabled]
etc.
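
Rather than comparing two such listings by eye, you can diff them. The following is only a minimal sketch: the helper names `enabled_flags` and `diff_flags` are made up for this example, and it assumes the listing has the two-column layout shown above (flag name, then `[enabled]` or `[disabled]`). It only tracks on/off flags, so options that take a value are ignored.

```shell
# Hypothetical helper: read `g++ -Q --help=optimizers` output on stdin
# and print only the names of flags reported as [enabled], sorted.
enabled_flags() {
    awk '$2 == "[enabled]" { print $1 }' | sort
}

# Hypothetical helper: diff the enabled flags of two option sets.
# Usage: diff_flags "<options A>" "<options B>"
diff_flags() {
    a=$(mktemp)
    b=$(mktemp)
    # $1 and $2 are deliberately unquoted so that a string such as
    # "-O2 -ftree-vectorize" is split into separate arguments.
    g++ -c -Q $1 --help=optimizers | enabled_flags > "$a"
    g++ -c -Q $2 --help=optimizers | enabled_flags > "$b"
    diff "$a" "$b"
    rm -f "$a" "$b"
}
```

For the case in the question you would call it as `diff_flags "-O3" "-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone -ftree-loop-distribute-patterns"`. Any line in the output is a flag that is enabled on one side only. Empty output would not prove the two builds are identical, though: the GCC manual notes that not all optimizations are controlled directly by a flag.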