
What compilation flags are activated by using O3?

We are in the process of changing the Intel compiler version from v14 to v18 on our systems. While running our tests, we noticed that O3 produces incorrect results in some cases, whereas the same code runs correctly with O3 on v14. What are the differences in the optimizations between these two versions, and how can I get a full list of the flags that are activated by O3 in each version? Thank you all in advance for your help and suggestions.

The behaviour of -O3 is documented on Intel's website: https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/compiler-option-details/optimization-options/o.html

O3

  • Performs O2 optimizations and enables more aggressive loop transformations such as Fusion, Block-Unroll-and-Jam, and collapsing IF statements.
  • This option may set other options. This is determined by the compiler, depending on which operating system and architecture you are using. The options that are set may change from release to release.
  • When O3 is used with options -ax or -x (Linux) or with options /Qax or /Qx (Windows), the compiler performs more aggressive data dependency analysis than for O2, which may result in longer compilation times.
  • The O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to O2 optimizations.
  • The O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.
  • Many routines in the shared libraries are more highly optimized for Intel® microprocessors than for non-Intel microprocessors.

The "Alternate options" section at the bottom of the page lists only -Od (which disables all optimizations, so it is probably not what you want).

So it looks like -O3 activates optimizations that cannot be expressed through other flags (that is, -O3 does not have a long-form equivalent).

Looking at Intel's page about the techniques used for high-level optimization (HLO), it looks like they cannot be enabled à la carte. HLO is all-or-nothing: it is enabled by either O2 or O3, with O2 using only a subset of O3's techniques.

Compare that to -Ofast, which does have a long-form equivalent:

Ofast

  • It sets compiler options -O3, -no-prec-div, and -fp-model fast=2.
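As a side note on why relaxed floating-point settings such as -fp-model fast=2 can change numerical results: IEEE 754 addition is not associative, so a compiler that is allowed to reorder a reduction (for example while unrolling or vectorizing a loop) can legitimately produce a different sum. The sketch below is only an illustration of that effect, written in Python for brevity (the same double-precision arithmetic applies to C and C++). The function names and the data are made up for the example, and nothing here confirms that this is the cause of the differences seen between v14 and v18.

```python
# Illustrative only: shows that the order of floating-point additions matters.
# The function names and data are hypothetical, not taken from any compiler.

def sum_left_to_right(values):
    """Add the values strictly in source order, like an unoptimized loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_two_lanes(values):
    """Mimic a 2-way unrolled reduction: two partial sums, combined at the end."""
    lane0 = 0.0
    lane1 = 0.0
    for i in range(0, len(values) - 1, 2):
        lane0 += values[i]
        lane1 += values[i + 1]
    if len(values) % 2:  # leftover element when the length is odd
        lane0 += values[-1]
    return lane0 + lane1


data = [1e16, 1.0, -1e16, 1.0]

print(sum_left_to_right(data))  # 1.0 (the first 1.0 is lost when added to 1e16)
print(sum_two_lanes(data))      # 2.0 (the large values cancel before the small ones are added)
```

If a test compares results bit-for-bit, a difference like this shows up as an "incorrect" result even though both orderings are valid under relaxed floating-point rules. Checking whether the discrepancy disappears with a stricter setting of the -fp-model option (or at O2) is one way to narrow it down.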
