
Kahan summation

Original post: 2011-02-09 00:15:20 9 4 algorithm/ floating-point

Has anyone used Kahan summation in an application? When would the extra precision be useful?

I hear that on some platforms double operations are quicker than float operations. How can I test this on my machine?

4 answers

Kahan summation works well when you are summing numbers and you need to minimize the worst-case floating-point error. Without this technique, you may have significant loss of precision in add operations if two operands differ in magnitude by roughly the number of significant digits the type provides (e.g. 1 + 1e-12). Kahan summation compensates for this.
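As a rough illustration of the compensation described above, here is a minimal Python sketch. The function name `kahan_sum` and the test data are my own choices, not something from the answer. Python floats are double precision, so the small terms are picked to fall below half an ulp of 1.0, and `math.fsum` serves as the reference value.

```python
import math

def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # apply the correction from the previous step
        t = total + y                   # low-order bits of y may be rounded away here
        compensation = (t - total) - y  # what was actually added, minus what we meant to add
        total = t
    return total

# One large term followed by many terms too small to register individually.
values = [1.0] + [1e-16] * 100_000

naive = 0.0
for x in values:
    naive += x  # each 1e-16 is below half an ulp of 1.0, so it is rounded away

kahan = kahan_sum(values)
exact = math.fsum(values)  # correctly rounded reference sum

print("naive error:", abs(naive - exact))
print("kahan error:", abs(kahan - exact))
```

The naive loop never moves away from 1.0, while the compensated sum stays within a few ulps of the reference. Note that aggressive compiler flags (in compiled languages) can optimize the compensation term away, since it is algebraically zero.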

An excellent resource on floating-point issues is "What Every Computer Scientist Should Know About Floating-Point Arithmetic": http://www.validlab.com/goldberg/paper.pdf

On single vs double precision performance: yes, single precision can be significantly faster, but it depends on the particular machine. See: https://www.hpcwire.com/2006/06/16/less_is_more_exploiting_single_precision_math_in_hpc-1/

The best way to test is to write a short example that tests the operations you care about, using both single (float) and double precision, and measure the runtimes.

I've used Kahan summation for Monte-Carlo integration. You have a scalar-valued function f which you believe is rather expensive to evaluate; a reasonable estimate is 65ns/dimension. Then you accumulate those values into an average; updating an average takes about 4ns. So if you update the average using Kahan summation (4x as many flops, ~16ns), you're really not adding that much compute to the total. Now, it is often said that the error of Monte-Carlo integration is σ/√N, but this is incorrect. The real error bound (in finite precision arithmetic) is

σ/√N + cond(I_n) ε N

where cond(I_n) is the condition number of summation and ε is twice the unit roundoff. So the algorithm diverges faster than it converges. For 32-bit arithmetic, getting εN ~ 1 is simple: 10^7 evaluations can be done exceedingly quickly, and after this your Monte-Carlo integration goes on a random walk. The situation is even worse when the condition number is large.

If you use Kahan summation, the expression for the error changes to

σ/√N + cond(I_n) ε² N,

which admittedly still diverges faster than it converges, but ε²N cannot be made large on a reasonable timescale on modern hardware.
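To get a feel for the difference between the εN and ε²N terms, here is a small sketch that simulates 32-bit accumulation in Python. It is an illustration under assumptions, not the setup from the answer. The helper `f32` (my own name) rounds every intermediate result to single precision through the `struct` module, the constant 0.1 stands in for the sampled function values, and N is kept at 200,000 so that the pure-Python loop finishes quickly.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def naive_sum_f32(values):
    """Plain running sum with every addition rounded to single precision."""
    total = 0.0
    for x in values:
        total = f32(total + x)
    return total

def kahan_sum_f32(values):
    """Kahan summation with every operation rounded to single precision."""
    total = 0.0
    comp = 0.0
    for x in values:
        y = f32(x - comp)
        t = f32(total + y)
        comp = f32(f32(t - total) - y)
        total = t
    return total

n = 200_000
x = f32(0.1)        # stand-in for one sample value
values = [x] * n
exact = x * n       # exact in binary64: 24-bit times 18-bit fits in 53 bits

naive = naive_sum_f32(values)
kahan = kahan_sum_f32(values)

print("naive float32 error:", abs(naive - exact))
print("kahan float32 error:", abs(kahan - exact))
print("naive mean:", naive / n, " kahan mean:", kahan / n)
```

Even at this modest N the naive single-precision sum has drifted visibly, while the compensated sum stays within about one single-precision ulp of the exact value.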

I've used Kahan summation to compensate for accumulated error when computing running averages. It does make quite a difference and it's easy to test. I eliminated rather large errors after only 100 summations.

I would definitely use the Kahan summation algorithm to compensate for the error in any running totals.

However, I've noticed quite large (1e-3) errors when doing inverse matrix multiplication. Basically, if A*x = y, then inv(A)*y ~= x; I'm not getting the original values back exactly. Which is fine, but I thought maybe Kahan summation would help (there's a lot of addition), especially with matrices larger than 3-by-3. I tried with a 4-by-4 matrix and it did not improve the situation at all.

When would the extra precision be useful?

Very roughly:

Case 1

When you are

Summing up a lot of data
in a non-sequential fashion, i.e. computing sums, then summing up the sums (as opposed to iterating over all data with a running sum),

then Kahan summation makes a lot of sense in the second phase, when you sum up the sums: the errors you're avoiding are by now more significant, while the overhead is paid only for a small fraction of the overall sum operations.
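The two-phase pattern from Case 1 might look roughly like the sketch below. The chunk size, the data, and the names `two_phase_sum` and `kahan_sum` are illustrative assumptions. The first phase uses a plain loop per chunk (in practice these would typically run in parallel), and only the much shorter list of partial sums goes through Kahan summation.

```python
import math

def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    comp = 0.0
    for x in values:
        y = x - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

def two_phase_sum(data, chunk_size):
    """Phase 1: cheap naive sum per chunk. Phase 2: Kahan over the partial sums."""
    partials = []
    for start in range(0, len(data), chunk_size):
        partial = 0.0
        for x in data[start:start + chunk_size]:
            partial += x              # no compensation in the bulk of the work
        partials.append(partial)
    return kahan_sum(partials)        # compensation only for len(data)/chunk_size terms

data = [0.1] * 100_000
result = two_phase_sum(data, chunk_size=1_000)
reference = math.fsum(data)           # correctly rounded reference sum

print("two-phase error:", abs(result - reference))
```

With 100,000 inputs and chunks of 1,000, the compensated loop runs over only 100 values, so its extra flops are a small fraction of the total work.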

Case 2

When you're working with a lower-precision floating-point type, you aren't sure you're meeting the accuracy requirement, and you're not allowed to switch to a larger, higher-precision type.