
How to determine error in floating-point calculations?

Original post: 2015-10-03 00:29:31. Tags: math, binary, floating-point, floating-accuracy, ieee-754

I have the following equation I want to implement in floating-point arithmetic:

Equation: sqrt((a-b)^2 + (c-d)^2 + (e-f)^2)

How do I determine how the width of the mantissa affects the accuracy of the result? What is the correct mathematical approach to determining this?

For instance, if I perform the following operations, how will the accuracy be affected after each step?

Here are the steps:

Step 1, Perform the following calculations in 32-bit single-precision floating point: x = (a-b), y = (c-d), z = (e-f)

Step 2, Round the three results to a mantissa of 16 bits (not including the hidden bit).

Step 3, Perform the following squaring operations: x2 = x^2, y2 = y^2, z2 = z^2

Step 4, Round x2, y2, and z2 to a mantissa of 10 bits (after the binary point).

Step 5, Add the values: w = x2 + y2 + z2

Step 6, Round the result to 16 bits.

Step 7, Take the square root: sqrt(w)

Step 8, Round to 20 mantissa bits (not including the hidden bit).

1 Answer

There are various ways of representing the error of a floating-point number. There is relative error (a * (1 + ε)), the subtly different ULP error (a + ulp(a) * ε), and absolute error (a + ε). Each of them can be used in analysing the error, but all have shortcomings. To get sensible results you often have to take into account what happens precisely inside floating-point calculations. I'm afraid that the 'correct mathematical approach' is a lot of work, and instead I'll give you the following.
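To make the three measures concrete, here is a minimal sketch that computes them for a pair of binary64 values. The function name `error_measures` is made up for this illustration, and `math.ulp` assumes Python 3.9 or later.

```python
import math

def error_measures(approx: float, exact: float) -> tuple[float, float, float]:
    """Return (absolute, relative, ULP) error of approx against a nonzero exact value."""
    absolute = abs(approx - exact)
    relative = absolute / abs(exact)
    ulps = absolute / math.ulp(exact)   # math.ulp needs Python 3.9+
    return absolute, relative, ulps

# Two values in the same binade [1, 2), each perturbed by exactly 3 ULP.
low = 1.0
high = 1.9999
err_low = error_measures(low + 3 * math.ulp(low), low)
err_high = error_measures(high + 3 * math.ulp(high), high)
print(err_low)
print(err_high)
```

Both perturbations are the same in ULPs and in absolute terms, yet the relative error near 1.0 is almost twice the relative error near 2.0. That is the subtle difference between the relative and ULP measures.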

Simplified ULP-based analysis

The following analysis is quite crude, but it does give a good 'feel' for how much error you end up with. Treat these as examples only.

(a-b): The operation itself gives you up to 0.5 ULP of error (with round-to-nearest-even, RNE). This rounding error can be small compared to the inputs, but if the inputs are very similar and already contain error, you could be left with nothing but noise!
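A small sketch of that cancellation effect, assuming the true inputs are 1.0 + 1e-7 and 1.0 (values picked only for illustration). The helper `f32` is a hypothetical name; it rounds a Python float to single precision through the standard `struct` module.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

true_a, true_b = 1.0 + 1e-7, 1.0
a, b = f32(true_a), f32(true_b)      # storing the inputs already rounds them
diff = f32(a - b)                    # the subtraction itself is exact here
true_diff = 1e-7
rel_error = abs(diff - true_diff) / true_diff
print(diff, rel_error)
```

The subtraction adds no rounding error of its own in this case, but the half-ULP-sized error already present in `a` ends up as a relative error of roughly 19% in the difference.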

(a^2): This operation multiplies not only the input, but also the input error. Measured in ULPs, that means at least multiplying the error by the other mantissa. Interestingly, there is a small normalisation step in the multiplier, which means the error in ULPs is halved if the multiplication result crosses a power-of-two boundary. The worst case is where the product lands just below that boundary, e.g. two inputs that are almost sqrt(2). In this case an input error of ε becomes 2*ε*sqrt(2). With ε = 0.5 ULP and an additional final rounding error of 0.5 ULP, the total is an error of ~2 ULP.
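The squaring estimate can be written as a first-order model. This is only a sketch of the crude reasoning above, not a rigorous bound; `square_error_ulps` and its parameters are names invented for the example, and the input mantissa `m` is assumed to lie in [1, 2).

```python
def square_error_ulps(m: float, eps_in: float = 0.5) -> float:
    """Crude ULP error estimate for m*m, given mantissa m in [1, 2) with eps_in ULP of input error."""
    propagated = 2.0 * m * eps_in    # d(m^2) = 2*m*dm, in units of the input ULP
    if m * m >= 2.0:                 # product crosses a power of two: its ULP doubles
        propagated /= 2.0
    return propagated + 0.5          # plus the rounding of the product itself

for m in (1.0, 1.41, 1.42, 1.99):
    print(m, square_error_ulps(m))
```

In this model the estimate peaks just below sqrt(2), at a little over 1.9 ULP, and drops as soon as the product crosses 2.0.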

Adding positive numbers: The worst case here is just the input errors added together, plus another rounding error. We're now at 3*2+0.5 = 6.5 ULP.

sqrt: The worst case for a sqrt is when the input is close to e.g. 1.0. The error roughly just gets passed through, plus an additional rounding error. We're now at 7 ULP.
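The running total so far can be tallied in a few lines. The figures are the crude ones from this answer; reading an error of n ULP as roughly log2(n) lost low-order bits is my own rule of thumb, added here as an assumption.

```python
import math

ROUNDING = 0.5                     # one round-to-nearest operation, in ULP
square = 2.0                       # rounded-up figure for one squared difference
summed = 3 * square + ROUNDING     # three squares added, plus one rounding
after_sqrt = summed + ROUNDING     # error passed through sqrt, plus one rounding
lost_bits = math.log2(after_sqrt)  # assumption: n ULP is about log2(n) lost bits
print(summed, after_sqrt, round(lost_bits, 2))
```

Under that reading, 7 ULP corresponds to a bit under 3 unreliable low-order mantissa bits, before any of the intermediate rounding steps are counted.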

Intermediate rounding steps: It will take a bit more work to plug in your intermediate rounding steps. You can model these as an error related to the number of bits you're rounding off. E.g. going from a 23-bit to a 10-bit mantissa with RNE introduces an additional 2^(13-1) = 2^12 ULP error relative to the 23-bit mantissa, or 0.5 ULP relative to the new mantissa (you'll have to scale down your other errors if you want to work with that).
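A rounding step like this can be simulated directly. The sketch below assumes round-to-nearest-even and ignores the exponent range (no overflow or subnormal handling). `round_mantissa` is a hypothetical helper, not a library function. Rounding from 23 to 10 bits can move a value by up to half of a new ULP, which is 2^12 ULPs of the 23-bit format.

```python
import math

def round_mantissa(x: float, bits: int) -> float:
    """Round x to `bits` explicit mantissa bits (hidden bit not counted), ties to even."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)              # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** (bits + 1)         # bits + 1 significant bits including the hidden one
    return math.ldexp(round(m * scale) / scale, e)   # round() breaks ties to even

x = 1.0 + 2.0 ** -11                  # exactly halfway between two 10-bit neighbours
r = round_mantissa(x, 10)
err_in_ulp23 = abs(r - x) / 2.0 ** -23    # error in ULPs of a 23-bit mantissa near 1.0
err_in_ulp10 = abs(r - x) / 2.0 ** -10    # error in ULPs of the 10-bit mantissa
print(r, err_in_ulp23, err_in_ulp10)
```

The halfway case is the worst one: the same error is 4096 ULP when measured against the 23-bit mantissa and 0.5 ULP when measured against the 10-bit one.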

I'll leave it to you to count the errors of your detailed example, but as the commenters noted, rounding to a 10-bit mantissa will dominate, and your final result will be accurate to roughly 8 mantissa bits.
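One way to sanity-check such a count is to simulate the eight steps and compare against a binary64 reference. The sketch below makes several assumptions of my own: each operation is computed in binary64 and then rounded to the stated width with round-to-nearest-even, the exponent range is ignored, and the inputs are random values in [-10, 10] that are exactly representable in single precision. The names `round_mantissa`, `pipeline`, and `worst_relative_error` are hypothetical.

```python
import math
import random

def round_mantissa(x: float, bits: int) -> float:
    """Round x to `bits` explicit mantissa bits (hidden bit not counted), ties to even."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)
    scale = 2.0 ** (bits + 1)
    return math.ldexp(round(m * scale) / scale, e)

def pipeline(a, b, c, d, e, f) -> float:
    # Steps 1-2: differences in single precision (23 bits), then rounded to 16 bits.
    x, y, z = (round_mantissa(round_mantissa(p - q, 23), 16)
               for p, q in ((a, b), (c, d), (e, f)))
    # Steps 3-4: squares, rounded to 10 bits.
    x2, y2, z2 = (round_mantissa(v * v, 10) for v in (x, y, z))
    # Steps 5-6: sum, rounded to 16 bits.
    w = round_mantissa(x2 + y2 + z2, 16)
    # Steps 7-8: square root, rounded to 20 bits.
    return round_mantissa(math.sqrt(w), 20)

def worst_relative_error(samples: int = 2000, seed: int = 0) -> float:
    """Largest relative error seen over random single-precision inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        args = [round_mantissa(rng.uniform(-10.0, 10.0), 23) for _ in range(6)]
        a, b, c, d, e, f = args
        exact = math.sqrt((a - b) ** 2 + (c - d) ** 2 + (e - f) ** 2)
        if exact > 0.0:
            worst = max(worst, abs(pipeline(*args) - exact) / exact)
    return worst

worst = worst_relative_error()
if worst > 0.0:
    print(f"worst relative error: {worst:.3e}, about {-math.log2(worst):.1f} bits")
```

A random sample only shows typical behaviour over the sampled range; it is not a worst-case bound, so it complements the ULP count rather than replacing it.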