
Fermi CUDA double precision against C

Original post: 2010-12-30 19:18:55. Tags: cuda, rounding

There is a small discrepancy between CPU and GPU double-precision results when using a Fermi GPU.

For example, on a small test set I get the following absolute error: Number 1 (CPU) - Number 2 (GPU) = 3E-018.

In binary form the difference is, as expected, very small:

Number 1 in binary:

xxxxxxxxxxxxx111000000010 01

vs

Number 2 in binary:

xxxxxxxxxxxx1111000000010 10

Although this is a difference of one binary digit, I am keen to eliminate any differences, because the errors add up over the course of my code.

Any tips from those familiar with Fermi? If this is unavoidable, can I get C/C++ to mimic Fermi's rounding behaviour?

1 answer

You should take a look at this post.

Floating-point arithmetic is not associative, so if a compiler chooses to perform operations in a different order, you will get a different result. Two versions of the same compiler can produce differences! Different compilers are even more likely to produce differences, and if you are doing work in parallel on the GPU (you are, right?) then you are inherently doing operations in a different order.
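To make the non-associativity concrete, here is a minimal C++ sketch that adds the same three doubles in two different orders. The helper names `sum_left`, `sum_right` and `show_orderings` are invented for illustration and do not come from the post. The sketch assumes IEEE 754 binary64 doubles; the `volatile` temporaries are only there to force each intermediate result to be rounded to double.

```cpp
#include <cstdio>

// Sum three doubles left to right: (a + b) + c.
double sum_left(double a, double b, double c) {
    volatile double t = a + b;  // volatile forces the intermediate to be rounded to double
    volatile double r = t + c;
    return r;
}

// Sum the same three doubles right to left: a + (b + c).
double sum_right(double a, double b, double c) {
    volatile double t = b + c;
    volatile double r = a + t;
    return r;
}

// Print both orderings with 17 significant digits so that a
// last-bit difference between them becomes visible.
void show_orderings(double a, double b, double c) {
    std::printf("(a + b) + c = %.17g\n", sum_left(a, b, c));
    std::printf("a + (b + c) = %.17g\n", sum_right(a, b, c));
}
```

Under these assumptions, `sum_left(1e16, -1e16, 1.0)` gives 1 while `sum_right(1e16, -1e16, 1.0)` gives 0, because `-1e16 + 1.0` is not representable and rounds back to `-1e16`. A parallel reduction on the GPU effectively picks yet another grouping, so a last-bit difference from a sequential CPU loop is to be expected.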

Fermi hardware is IEEE 754-2008 compliant, which means that in addition to standard IEEE 754 rounding it also has the fused multiply-add (FMA) instruction, which avoids losing precision between the multiplication and the addition. An FMA computes a*b + c with a single rounding at the end, whereas a separate multiply followed by an add rounds twice, so the two can differ in the last bit.
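The effect of the FMA can be reproduced on the CPU with the standard `std::fma` function from `<cmath>`. The sketch below is illustrative only: the function names `mul_then_add`, `fused` and `show_fma_difference` and the operand values are assumptions chosen to expose the difference, and whether FMA is what actually causes the discrepancy in the question is not something the post confirms.

```cpp
#include <cmath>
#include <cstdio>

// a*b rounded to double, then the sum rounded again (two roundings).
double mul_then_add(double a, double b, double c) {
    volatile double p = a * b;  // volatile keeps the compiler from fusing this into an FMA
    volatile double r = p + c;
    return r;
}

// a*b + c with a single rounding at the end, as an FMA instruction computes it.
double fused(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Print both results for operands where the low bits of the product matter:
// a*a = 1 + 2^-29 + 2^-60, and the 2^-60 term does not fit in a double.
void show_fma_difference() {
    const double a = 1.0 + std::ldexp(1.0, -30);     // 1 + 2^-30
    const double c = -(1.0 + std::ldexp(1.0, -29));  // -(1 + 2^-29)
    std::printf("mul then add: %.17g\n", mul_then_add(a, a, c));
    std::printf("fused       : %.17g\n", fused(a, a, c));
}
```

With these operands the two-step version returns 0, while the fused version returns 2^-60, the part of the product that the intermediate rounding discards. To make the two sides agree, one option is to call `std::fma` on the CPU wherever the GPU code fuses. The other direction is to stop the GPU from fusing, for example by compiling with nvcc's `-fmad=false` option or by writing the operations with the `__dmul_rn` and `__dadd_rn` intrinsics, which the CUDA documentation says are never merged into an FMA. Either approach removes only the FMA-related difference; differences caused by the order of operations remain.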