
Floating-point precision when moving from i386 to x86_64

I have an application that was developed for 32-bit Linux on x86. It performs a lot of floating-point operations, and many tests depend on the results. Now we are porting it to x86_64, but the test results are different on this architecture. We don't want to keep a separate set of results for each architecture.

According to the article An Introduction to GCC - for the GNU compilers gcc and g++, the problem is that GCC on x86_64 assumes -mfpmath=sse while on x86 it assumes -mfpmath=387. The 387 FPU uses 80-bit internal precision for all operations and only converts the result to a given floating-point type (float, double or long double), while SSE uses the type of the operands to determine its internal precision.
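As a rough illustration of the difference described above, C99 exposes the evaluation mode through the `FLT_EVAL_METHOD` macro in `<float.h>`. With GCC it is typically 2 when targeting the 387 and 0 when targeting SSE. The helper name `double_expr_mant_bits` below is made up for this sketch; it only maps the macro's value to the number of significand bits a `double` expression is evaluated with.

```c
#include <assert.h>
#include <float.h>

/* The C standard requires long double to be at least as precise as double. */
static_assert(DBL_MANT_DIG <= LDBL_MANT_DIG, "long double narrower than double");

/* Hypothetical helper: significand bits used when evaluating an expression
 * whose operands are doubles, for a given FLT_EVAL_METHOD value.
 * Returns -1 when the method is indeterminate or implementation-defined. */
int double_expr_mant_bits(int eval_method)
{
    switch (eval_method) {
    case 0:  /* evaluate in the operand type (typical for SSE) */
    case 1:  /* float is promoted to double, double stays double */
        return DBL_MANT_DIG;
    case 2:  /* everything is evaluated as long double (typical for x87) */
        return LDBL_MANT_DIG;
    default:
        return -1;
    }
}
```

Calling `double_expr_mant_bits(FLT_EVAL_METHOD)` in both builds should show whether the two targets really evaluate `double` expressions at different precision.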

I can force -mfpmath=387 when compiling my own code, and then all my own operations give the expected results, but whenever I call a library function (sin, cos, atan2, etc.) the results are off again. I assume that's because libm was compiled without the fpmath override.

I tried to build libm (glibc) myself using 387 emulation, but it caused a lot of crashes all around (I don't know whether I did something wrong).

Is there a way to force all code in a process to use the 387 emulation on x86_64? Or maybe some library that returns the same values as libm does on both architectures? Any suggestions?

Regarding the question "Do you need the 80-bit precision?", I have to say that it is not a problem for an individual operation. In that simple case the difference is really small and doesn't matter. When compounding a lot of operations, though, the error propagates, and the difference in the final result is no longer small enough to ignore. So I guess I need the 80-bit precision.

I'd say you need to fix your tests. You're generally setting yourself up for disappointment if you assume floating-point math to be accurate. Instead of testing for exact equality, test whether the result is close enough to the expected one. What you've found isn't a bug, after all, so if your tests report errors, the tests are wrong. ;)

As you've found out, every library you rely on is going to assume SSE floating point, so unless you plan to compile everything manually, now and forever, just so you can set the FP mode to x87, you're better off dealing with the problem now and accepting that FP math is not 100% accurate and will not in general yield the same result on two different platforms. (I believe AMD CPUs yield slightly different results in x87 math as well.)

Do you absolutely need 80-bit precision? (If so, there obviously aren't many alternatives, other than compiling everything yourself to use 80-bit FP.)

Otherwise, adjust your tests to perform comparisons and equality tests within some small epsilon. If the difference is smaller than that epsilon, the values are considered equal.
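A minimal sketch of such a comparison in C, assuming a combined relative and absolute tolerance. The function name `nearly_equal` and both tolerance parameters are illustrative and do not come from any library; suitable values depend on how much error the computation under test accumulates.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical test helper: true when a and b differ by no more than
 * abs_tol, or by no more than rel_tol times the larger magnitude. */
bool nearly_equal(double a, double b, double rel_tol, double abs_tol)
{
    assert(rel_tol >= 0.0 && abs_tol >= 0.0);

    if (a == b)                       /* exact match, including equal infinities */
        return true;
    if (isnan(a) || isnan(b) || isinf(a) || isinf(b))
        return false;                 /* NaN or mismatched infinities never match */

    double diff  = fabs(a - b);
    double mag_a = fabs(a), mag_b = fabs(b);
    double scale = mag_a > mag_b ? mag_a : mag_b;
    double bound = rel_tol * scale;
    if (bound < abs_tol)              /* the absolute floor matters near zero */
        bound = abs_tol;
    return diff <= bound;
}
```

A test would then check something like `nearly_equal(actual, expected, 1e-9, 1e-12)` instead of `actual == expected`. The absolute tolerance is there because a purely relative check becomes far too strict when the expected value is close to zero.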

80-bit precision is actually dangerous. The problem is that it is only preserved as long as the variable is held in a CPU register. Whenever the variable is forced out to RAM, it is rounded to the precision of its type. So a variable can actually change its value even though nothing happened to it in the code.
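The effect can be provoked on purpose. The sketch below, with a made-up function name, uses `volatile` to force a quotient out to memory as a 64-bit `double` and then compares it with the same quotient recomputed in place. On an SSE build the two always match; on an x87 build the recomputed value may still carry 80-bit precision, so the comparison may fail, depending on compiler and optimization level.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical probe: does x / y compare equal to itself after one copy
 * has been forced through memory as a double? */
bool quotient_survives_store(double x, double y)
{
    assert(y != 0.0);
    volatile double stored = x / y;   /* written to memory, rounded to double */
    return stored == x / y;           /* recomputed; may be held at wider precision */
}
```

With GCC on x87, options such as `-ffloat-store` reduce this kind of surprise at some cost in speed.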

If you want long double precision, use long double for all of your floating-point variables, rather than expecting float or double to have extra magic precision. This is really a no-brainer.
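As a sketch of what that looks like in practice, the function below (the name `sum_extended` is made up) keeps its accumulator in an explicit `long double`, so the extra precision no longer depends on which FPU the compiler targets. This assumes `long double` is the 80-bit x87 format, as it is with GCC on Linux for both i386 and x86_64; on other platforms it can be a different format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum an array of doubles with a long double
 * accumulator, rounding to double only once at the end. */
double sum_extended(const double *values, size_t count)
{
    assert(values != NULL || count == 0);

    long double acc = 0.0L;           /* explicit extended-precision accumulator */
    for (size_t i = 0; i < count; ++i)
        acc += values[i];
    return (double)acc;               /* single, visible rounding step */
}
```

Library calls need the same treatment: the `l`-suffixed variants such as `sinl`, `cosl` and `atan2l` take and return `long double`.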

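SSE floating point and 387 floating point use entirely different instructions, so there is no way to convince the SSE FP instructions to use the 387. Probably the best way to deal with this is to make your test suite tolerate slightly different results, rather than depending on results being identical down to the last bit.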
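One way to express "not identical down to the last bit" in a test is to count how many representable doubles lie between two results (their distance in ULPs) and allow a small number. The following is a sketch only: the names `ordered_key` and `ulp_distance` are illustrative, and it assumes `double` is the 64-bit IEEE 754 format.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(int64_t), "expects 64-bit doubles");

/* Map a double to an integer whose ordering matches the numeric ordering,
 * so that adjacent doubles get adjacent keys (+0.0 and -0.0 both map to 0). */
static int64_t ordered_key(double x)
{
    int64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the bit pattern */
    return bits < 0 ? INT64_MIN - bits : bits;
}

/* Hypothetical helper: number of representable doubles between a and b. */
uint64_t ulp_distance(double a, double b)
{
    assert(!isnan(a) && !isnan(b));
    int64_t ka = ordered_key(a);
    int64_t kb = ordered_key(b);
    /* Subtract as unsigned so that very distant values cannot overflow. */
    return ka >= kb ? (uint64_t)ka - (uint64_t)kb
                    : (uint64_t)kb - (uint64_t)ka;
}
```

A test could then accept a result when, for example, `ulp_distance(actual, expected) <= 4`. The right bound is a judgment call that depends on how many operations feed into the value.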


