Enabling strict floating point mode in GCC

Original post: 2011-09-03 21:01:57. Tags: c, gcc

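I have not yet written a program to see whether GCC needs such an option passed to it. When I do, I would like to know how to enable a strict floating-point mode that makes results reproducible between runs and between computers.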

3 answers

Compiling with -msse2 on an Intel/AMD processor that supports it will get you almost there. On 32-bit x86 targets, add -mfpmath=sse as well: -msse2 alone only makes the instructions available, and GCC keeps using the x87 unit for arithmetic there by default. Do not let any library put the FPU in FTZ/DAZ (flush-to-zero / denormals-are-zero) mode, and you will be mostly set (processor bugs notwithstanding).
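Both points can be sanity-checked from inside the program. The following is only a minimal sketch in portable C, with function names made up for illustration: it reports the evaluation method the compiler advertises through FLT_EVAL_METHOD and checks whether a subnormal result survives, which it would not if some library had enabled FTZ or DAZ.

```c
#include <float.h>

/* 0 means each operation is evaluated in its own type (what SSE2 gives you),
   2 means everything is evaluated as long double (typical for x87),
   -1 means the method is indeterminable. */
int evaluation_method(void)
{
    return (int)FLT_EVAL_METHOD;
}

/* Returns 1 if subnormal numbers behave as IEEE 754 requires,
   0 if flush-to-zero or denormals-are-zero appears to be active. */
int subnormals_preserved(void)
{
    volatile double smallest_normal = DBL_MIN;
    /* Half of the smallest normal double is a subnormal, never zero. */
    volatile double half = smallest_normal / 2.0;
    return half > 0.0 && half * 2.0 == smallest_normal;
}
```

Calling subnormals_preserved() again after initializing third-party libraries is a cheap way to notice that one of them changed the floating-point environment.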

For other architectures, the answer would be different. Architectures that do not offer any convenient way to get exact IEEE 754 semantics (for instance, pre-SSE2 IA32 CPUs) would require a floating-point emulation library to get the result you want, at a very high performance penalty.

If your target architecture supports the fmadd instruction (multiplication and addition without intermediate rounding), make sure your compiler does not use it when you have explicit multiplications and additions in the source code. GCC is not supposed to do this unless you use the -ffast-math option. To be safe, pass -ffp-contract=off explicitly: recent GCC releases default to -ffp-contract=fast when not in a strict ISO C mode (such as -std=c99), and that setting does allow fused multiply-add on targets that have it.
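To see whether a given build rounds the product before the addition, something like the following sketch can help. The function names are hypothetical, and the inputs are chosen so that a*b is exactly 1 - 2^-54, which rounds to 1.0 as a double. With two roundings the result is 0.0; with a fused multiply-add (or with x87 excess precision) it is -2^-54.

```c
/* Product rounded to double before the addition: two IEEE 754 roundings. */
double mul_then_add(double a, double b, double c)
{
    volatile double product = a * b;  /* volatile forces a store, hence a rounding */
    return product + c;
}

/* Same arithmetic as one expression; the compiler may contract this into
   a fused multiply-add, or evaluate it in excess precision. */
double mul_add_expr(double a, double b, double c)
{
    return a * b + c;
}

/* Returns 1 if the one-expression form skipped the intermediate rounding. */
int single_rounding_detected(void)
{
    volatile double a = 1.0 + 0x1p-27;
    volatile double b = 1.0 - 0x1p-27;
    volatile double c = -1.0;
    return mul_then_add(a, b, c) != mul_add_expr(a, b, c);
}
```

If single_rounding_detected() returns 1 in an optimized build, the two forms are not interchangeable on that target with those flags.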

You can also use GCC's -mpc64 option on i386/ia32 targets to force double-precision computation even on the x87 FPU. See the GCC manual.

You can also modify the x87 FPU behavior at runtime; see "Deterministic cross-platform floating point arithmetics" and also "An Introduction to GCC".
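As a rough illustration of the runtime approach, here is a sketch that assumes glibc on x86, where <fpu_control.h> provides the _FPU_GETCW and _FPU_SETCW macros. The two function names are made up for this example, and on any other platform they are stubs that do nothing.

```c
#include <stdio.h>   /* any libc header, so that __GLIBC__ is defined below */

#if defined(__GLIBC__) && (defined(__i386__) || defined(__x86_64__))
#include <fpu_control.h>

/* Switch x87 precision control to a 53-bit significand. Returns 1 on success. */
int set_x87_double_precision(void)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);
    /* _FPU_EXTENDED (0x300) doubles as the mask for the precision-control bits. */
    cw = (fpu_control_t)((cw & ~_FPU_EXTENDED) | _FPU_DOUBLE);
    _FPU_SETCW(cw);
    return 1;
}

/* Returns the precision-control field of the x87 control word. */
unsigned x87_precision_field(void)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);
    return cw & _FPU_EXTENDED;
}
#else
/* Assumption: no x87 control word to adjust here, so report "not changed". */
int set_x87_double_precision(void) { return 0; }
unsigned x87_precision_field(void) { return 0; }
#endif
```

Two caveats: this narrows only the significand, not the exponent range, and it also affects long double arithmetic, so library routines that expect extended precision may lose accuracy.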

If you use -ffloat-store and always store intermediate values to variables or apply (explicit) casts to the desired type/precision, you should be at least 90% of the way to your goal, and maybe more. I'd welcome comments on whether there are cases this approach still misses. Note that I claim this works even without any SSE options.
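A small sketch of that coding style, with hypothetical function names, might look like this. Whether a cast alone (without a named variable) discards excess precision depends on the compiler mode; as far as I know GCC honors it under -fexcess-precision=standard, which the strict -std=c99 style modes enable.

```c
#include <stddef.h>

/* Dot product in which every intermediate result is stored to a double,
   so any excess precision is rounded away at each step. */
double dot_product(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double product = x[i] * y[i];  /* rounded when stored (with -ffloat-store) */
        sum = sum + product;
    }
    return sum;
}

/* An explicit cast asks for the same rounding without a named variable. */
double scaled_sum(double a, double b, double scale)
{
    return (double)((double)(a + b) * scale);
}
```

For example, dot_product over {1, 2, 3} and {4, 5, 6} gives 32.0. The point is not the value but that each step has a well-defined double result.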