For gcc, the manual explains what -O3, -Os, etc. translate to in terms of specific optimisation arguments (-funswitch-loops, -fcompare-elim, etc.).
I'm looking for the same info for clang.
I've looked online and in man clang, which only gives general information (-O2 optimises more aggressively than -O1, -Os optimises for size, ...), and also looked here on Stack Overflow and found this, but I haven't found anything relevant in the cited source files.
Edit: I found an answer, but I'm still interested if anyone has a link to a user manual documenting all optimisation passes and the passes selected by -Ox. So far I have only found this list of passes, but nothing on optimisation levels.
I found this related question.
To sum it up, to find out about compiler optimization passes:
llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments
As pointed out in Geoff Nixon's answer (+1), clang additionally runs some higher-level optimizations, which we can retrieve with:
echo 'int;' | clang -xc -O3 - -o /dev/null -\#\#\#
Documentation of individual passes is available here.
You can compare the effect of changing high-level flags such as -O like this:
diff -wy --suppress-common-lines \
<(echo 'int;' | clang -xc - -o /dev/null -\#\#\# 2>&1 | tr " " "\n" | grep -v /tmp) \
<(echo 'int;' | clang -xc -O0 - -o /dev/null -\#\#\# 2>&1 | tr " " "\n" | grep -v /tmp)
# will tell you that -O0 is indeed the default.
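The per-version lists below are described as parsed output of these commands. The following is a minimal sketch of one way such parsing could be done; it is not the script the answer's author used. The helper names pass_set and pass_delta are hypothetical, and the sketch assumes that opt prints one or more lines beginning with "Pass Arguments:" on stderr, as the legacy pass manager does with -debug-pass=Arguments.

```shell
#!/bin/sh
# Use byte-wise collation so that sort and comm agree on ordering.
LC_ALL=C
export LC_ALL

# pass_set (hypothetical helper): read opt's debug output on stdin and
# print every pass flag found on "Pass Arguments:" lines, once each, sorted.
pass_set() {
  grep '^Pass Arguments:' \
    | sed 's/^Pass Arguments:[[:space:]]*//' \
    | tr -s ' ' '\n' \
    | grep '^-' \
    | sort -u
}

# pass_delta BASE NEW (hypothetical helper): given two captured outputs,
# print the flags that appear only in NEW ("adds") and only in BASE ("drops").
pass_delta() {
  base_file=$(mktemp)
  new_file=$(mktemp)
  printf '%s\n' "$1" | pass_set > "$base_file"
  printf '%s\n' "$2" | pass_set > "$new_file"
  echo "adds: $(comm -13 "$base_file" "$new_file" | paste -s -d ' ' -)"
  echo "drops: $(comm -23 "$base_file" "$new_file" | paste -s -d ' ' -)"
  rm -f "$base_file" "$new_file"
}

# Example use, assuming llvm-as and a legacy-pass-manager opt are installed:
#   o1=$(llvm-as < /dev/null | opt -O1 -disable-output -debug-pass=Arguments 2>&1)
#   o2=$(llvm-as < /dev/null | opt -O2 -disable-output -debug-pass=Arguments 2>&1)
#   pass_delta "$o1" "$o2"
```

Because the sets are de-duplicated and sorted, this only shows which passes are present at each level, not the order or the number of times they run.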
With version 6.0 the passes are as follows:
baseline (-O0):
opt sets: -tti -verify -ee-instrument -targetlibinfo -assumption-cache-tracker -profile-summary-info -forceattrs -basiccg -always-inline -barrier
clang adds: -mdisable-fp-elim -mrelax-all
-O1 is based on -O0
opt adds: -targetlibinfo -tti -tbaa -scoped-noalias -assumption-cache-tracker -profile-summary-info -forceattrs -inferattrs -ipsccp -called-value-propagation -globalopt -domtree -mem2reg -deadargelim -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -always-inline -functionattrs -sroa -memoryssa -early-cse-memssa -speculative-execution -lazy-value-info -jump-threading -correlated-propagation -libcalls-shrinkwrap -branch-prob -block-freq -pgo-memop-opt -tailcallelim -reassociate -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-rotate -licm -loop-unswitch -indvars -loop-idiom -loop-deletion -loop-unroll -memdep -memcpyopt -sccp -demanded-bits -bdce -dse -postdomtree -adce -barrier -rpo-functionattrs -globaldce -float2int -loop-accesses -loop-distribute -loop-vectorize -loop-load-elim -alignment-from-assumptions -strip-dead-prototypes -loop-sink -instsimplify -div-rem-pairs -verify -ee-instrument -early-cse -lower-expect
clang adds: -momit-leaf-frame-pointer
clang drops: -mdisable-fp-elim -mrelax-all
-O2 is based on -O1
opt adds: -inline -mldst-motion -gvn -elim-avail-extern -slp-vectorizer -constmerge
opt drops: -always-inline
clang adds: -vectorize-loops -vectorize-slp
-O3 is based on -O2
opt adds: -callsite-splitting -argpromotion
-Ofast is based on -O3, valid in clang but not in opt
clang adds: -fno-signed-zeros -freciprocal-math -ffp-contract=fast -menable-unsafe-fp-math -menable-no-nans -menable-no-infs -mreassociate -fno-trapping-math -ffast-math -ffinite-math-only
-Os is similar to -O2
opt drops: -libcalls-shrinkwrap and -pgo-memop-opt
-Oz is based on -Os
opt drops: -slp-vectorizer
With version 3.8 the passes are as follows:
baseline (-O0):
opt sets: -targetlibinfo -tti -verify
clang adds: -mdisable-fp-elim -mrelax-all
-O1 is based on -O0
opt adds: -globalopt -demanded-bits -branch-prob -inferattrs -ipsccp -dse -loop-simplify -scoped-noalias -barrier -adce -deadargelim -memdep -licm -globals-aa -rpo-functionattrs -basiccg -loop-idiom -forceattrs -mem2reg -simplifycfg -early-cse -instcombine -sccp -loop-unswitch -loop-vectorize -tailcallelim -functionattrs -loop-accesses -memcpyopt -loop-deletion -reassociate -strip-dead-prototypes -loops -basicaa -correlated-propagation -lcssa -domtree -always-inline -aa -block-freq -float2int -lower-expect -sroa -loop-unroll -alignment-from-assumptions -lazy-value-info -prune-eh -jump-threading -loop-rotate -indvars -bdce -scalar-evolution -tbaa -assumption-cache-tracker
clang adds: -momit-leaf-frame-pointer
clang drops: -mdisable-fp-elim -mrelax-all
-O2 is based on -O1
opt adds: -elim-avail-extern -mldst-motion -slp-vectorizer -gvn -inline -globaldce -constmerge
opt drops: -always-inline
clang adds: -vectorize-loops -vectorize-slp
-O3 is based on -O2
opt adds: -argpromotion
-Ofast is based on -O3, valid in clang but not in opt
clang adds: -fno-signed-zeros -freciprocal-math -ffp-contract=fast -menable-unsafe-fp-math -menable-no-nans -menable-no-infs
-Os is the same as -O2
-Oz is based on -Os
opt drops: -slp-vectorizer
clang drops: -vectorize-loops
With version 3.7 the passes are as follows (parsed output of the command above):
default (-O0): -targetlibinfo -verify -tti
-O1 is based on -O0
adds : -sccp -loop-simplify -float2int -lazy-value-info -correlated-propagation -bdce -lcssa -deadargelim -loop-unroll -loop-vectorize -barrier -memcpyopt -loop-accesses -assumption-cache-tracker -reassociate -loop-deletion -branch-prob -jump-threading -domtree -dse -loop-rotate -ipsccp -instcombine -scoped-noalias -licm -prune-eh -loop-unswitch -alignment-from-assumptions -early-cse -inline-cost -simplifycfg -strip-dead-prototypes -tbaa -sroa -no-aa -adce -functionattrs -lower-expect -basiccg -loops -loop-idiom -tailcallelim -basicaa -indvars -globalopt -block-freq -scalar-evolution -memdep -always-inline
-O2 is based on -O1
adds : -elim-avail-extern -globaldce -inline -constmerge -mldst-motion -gvn -slp-vectorizer
removes : -always-inline
-O3 is based on -O2
adds : -argpromotion -verif
-Os is identical to -O2
-Oz is based on -Os
removes : -slp-vectorizer
For version 3.6 the passes are as documented in GYUNGMIN KIM's post.
With version 3.5 the passes are as follows (parsed output of the command above):
default (-O0): -targetlibinfo -verify -verify-di
-O1 is based on -O0
adds : -correlated-propagation -basiccg -simplifycfg -no-aa -jump-threading -sroa -loop-unswitch -ipsccp -instcombine -memdep -memcpyopt -barrier -block-freq -loop-simplify -loop-vectorize -inline-cost -branch-prob -early-cse -lazy-value-info -loop-rotate -strip-dead-prototypes -loop-deletion -tbaa -prune-eh -indvars -loop-unroll -reassociate -loops -sccp -always-inline -basicaa -dse -globalopt -tailcallelim -functionattrs -deadargelim -notti -scalar-evolution -lower-expect -licm -loop-idiom -adce -domtree -lcssa
-O2 is based on -O1
adds : -gvn -constmerge -globaldce -slp-vectorizer -mldst-motion -inline
removes : -always-inline
-O3 is based on -O2
adds : -argpromotion
-Os is identical to -O2
-Oz is based on -Os
removes : -slp-vectorizer
With version 3.4 the passes are as follows (parsed output of the command above):
-O0: -targetlibinfo -preverify -domtree -verify
-O1 is based on -O0
adds : -adce -always-inline -basicaa -basiccg -correlated-propagation -deadargelim -dse -early-cse -functionattrs -globalopt -indvars -inline-cost -instcombine -ipsccp -jump-threading -lazy-value-info -lcssa -licm -loop-deletion -loop-idiom -loop-rotate -loop-simplify -loop-unroll -loop-unswitch -loops -lower-expect -memcpyopt -memdep -no-aa -notti -prune-eh -reassociate -scalar-evolution -sccp -simplifycfg -sroa -strip-dead-prototypes -tailcallelim -tbaa
-O2 is based on -O1
adds : -barrier -constmerge -domtree -globaldce -gvn -inline -loop-vectorize -preverify -slp-vectorizer -targetlibinfo -verify
removes : -always-inline
-O3 is based on -O2
adds : -argpromotion
-Os is identical to -O2
-Oz is based on -O2
removes : -barrier -loop-vectorize -slp-vectorizer
With version 3.2 the passes are as follows (parsed output of the command above):
-O0: -targetlibinfo -preverify -domtree -verify
-O1 is based on -O0
adds : -sroa -early-cse -lower-expect -no-aa -tbaa -basicaa -globalopt -ipsccp -deadargelim -instcombine -simplifycfg -basiccg -prune-eh -always-inline -functionattrs -simplify-libcalls -lazy-value-info -jump-threading -correlated-propagation -tailcallelim -reassociate -loops -loop-simplify -lcssa -loop-rotate -licm -loop-unswitch -scalar-evolution -indvars -loop-idiom -loop-deletion -loop-unroll -memdep -memcpyopt -sccp -dse -adce -strip-dead-prototypes
-O2 is based on -O1
adds : -inline -globaldce -constmerge
removes : -always-inline
-O3 is based on -O2
adds : -argpromotion
-Os is identical to -O2
-Oz is identical to -Os
Edit [march 2014] removed duplicates from lists.
Edit [april 2014] added documentation link + options for 3.4
Edit [september 2014] added options for 3.5
Edit [december 2015] added options for 3.7 and mention existing answer for 3.6
Edit [may 2016] added options for 3.8, for both opt and clang and mention existing answer for clang (versus opt)
Edit [nov 2018] add options for 6.0
@Antoine's answer (and the other question linked) accurately describes the LLVM optimizations that are enabled, but there are a few other Clang-specific options (i.e., those that affect lowering to the AST) that are affected by the -O[0|1|2|3|fast] flags.
You can take a look at these with:
echo 'int;' | clang -xc -O0 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -O1 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -O2 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -O3 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -Ofast - -o /dev/null -\#\#\#
For example, -O0 enables -mrelax-all, -O1 enables -vectorize-loops and -vectorize-slp, and -Ofast enables -menable-no-infs, -menable-no-nans, -menable-unsafe-fp-math, -ffp-contract=fast and -ffast-math.
@Techogrebo:
Yes, you don't necessarily need the other LLVM tools. Try:
echo 'int;' | clang -xc - -o /dev/null -mllvm -print-all-options
Also, there are a lot more detailed options you can examine/modify with Clang alone... you just need to know how to get to them!
Try a few of:
clang -help
clang -cc1 -help
clang -cc1 -mllvm -help
clang -cc1 -mllvm -help-list-hidden
clang -cc1as -help
Starting with clang / LLVM 13.0.0, the legacy pass manager has been deprecated and the new pass manager is used by default. This means that the previous solution for printing the optimization passes used for the different optimization levels in opt will only work if the legacy pass manager is explicitly enabled with -enable-new-pm=0. So as long as the legacy pass manager is around (expected until LLVM 14), one can use the following command:
llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments -enable-new-pm=0
Alternatively, the execution order of the optimization passes with the new pass manager can be extracted with --debug-pass-manager (instead of -debug-pass=Arguments). Unfortunately the output is very verbose and some processing needs to be done to reconstruct the behavior manually with -passes=. If only transformation passes are of interest, one can use the option -debug-pass-manager=quiet to skip information about analyses.
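As a rough illustration of the kind of processing meant here, the sketch below pulls just the pass names, in execution order, out of the verbose log. The function name running_passes is hypothetical, and the sketch assumes that each executed pass is logged as a line of the form "Running pass: NAME on TARGET"; that format may differ between LLVM versions. The names printed are the pass class names from the log, which still have to be mapped by hand to the names accepted by -passes=.

```shell
#!/bin/sh
# running_passes (hypothetical helper): read --debug-pass-manager output on
# stdin and print the name of each executed pass, one per line, in order.
# Assumes lines shaped like "Running pass: NAME on TARGET"; analysis and
# invalidation lines are ignored.
running_passes() {
  sed -n 's/^Running pass: \(.*\) on .*$/\1/p'
}

# Example use, assuming llvm-as and a new-pass-manager opt are installed:
#   llvm-as < /dev/null | opt -O3 -disable-output --debug-pass-manager 2>&1 \
#     | running_passes
```

Since this log records passes as they actually run, an empty module will probably show only module-level passes; feeding opt a real IR file is likely to give a more complete picture.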
There is a user guide on how to use the new pass manager with opt on the LLVM website.
Pass Arguments: -targetlibinfo -no-aa -tbaa -scoped-noalias -assumption-cache-tracker -basicaa -notti -verify-di -ipsccp -globalopt -deadargelim -domtree -instcombine -simplifycfg -basiccg -prune-eh -inline-cost -always-inline -functionattrs -sroa -domtree -early-cse -lazy-value-info -jump-threading -correlated-propagation -simplifycfg -domtree -instcombine -tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa -loop-rotate -licm -loop-unswitch -instcombine -scalar-evolution -loop-simplify -lcssa -indvars -loop-idiom -loop-deletion -function_tti -loop-unroll -memdep -memcpyopt -sccp -domtree -instcombine -lazy-value-info -jump-threading -correlated-propagation -domtree -memdep -dse -adce -simplifycfg -domtree -instcombine -barrier -domtree -loops -loop-simplify -lcssa -branch-prob -block-freq -scalar-evolution -loop-vectorize -instcombine -simplifycfg -domtree -instcombine -loops -loop-simplify -lcssa -scalar-evolution -function_tti -loop-unroll -alignment-from-assumptions -strip-dead-prototypes -verify -verify-di
add : -inline -mldst-motion -domtree -memdep -gvn -memdep -scalar-evolution -slp-vectorizer -globaldce -constmerge
and removes: -always-inline
add: -argpromotion