
How to determine error in floating-point calculations?

I have the following equation I want to implement in floating-point arithmetic:

Equation: sqrt((a-b)^2 + (c-d)^2 + (e-f)^2)

I am wondering how the width of the mantissa affects the accuracy of the result, and what the correct mathematical approach to determining this is.

For instance, if I perform the following operations, how will the accuracy be affected after each step?

Here are the steps (a simulation sketch follows the list):

Step 1. Perform the following calculations in 32-bit single-precision floating point: x = (a-b), y = (c-d), z = (e-f)

Step 2. Round the three results to a mantissa of 16 bits (not including the hidden bit).

Step 3. Perform the squaring operations: x2 = x^2, y2 = y^2, z2 = z^2

Step 4. Round x2, y2, and z2 to a mantissa of 10 bits (after the binary point).

Step 5. Add the values: w = x2 + y2 + z2

Step 6. Round the result to a mantissa of 16 bits.

Step 7. Take the square root: sqrt(w)

Step 8. Round to a mantissa of 20 bits (not including the hidden bit).
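To make the steps concrete, here is a rough Python model of the pipeline I have in mind. The `round_to_bits` helper and the sample inputs are only placeholders, and ordinary doubles stand in for the exact intermediates (a real 32-bit implementation would also clamp the exponent); `math.hypot` with three arguments needs Python 3.8+.

```python
import math

def round_to_bits(x, bits):
    """Round x to `bits` mantissa bits (hidden bit excluded), round-to-nearest-even."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)                  # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** (bits + 1)             # hidden bit + `bits` fraction bits
    return math.ldexp(round(m * scale) / scale, e)  # Python's round() ties to even

def pipeline(a, b, c, d, e, f):
    # Step 1: differences, with the 23-bit mantissa of single precision
    x = round_to_bits(a - b, 23)
    y = round_to_bits(c - d, 23)
    z = round_to_bits(e - f, 23)
    # Step 2: round to 16 mantissa bits
    x, y, z = (round_to_bits(v, 16) for v in (x, y, z))
    # Steps 3-4: square, then round to 10 mantissa bits
    x2, y2, z2 = (round_to_bits(v * v, 10) for v in (x, y, z))
    # Steps 5-6: sum, then round to 16 mantissa bits
    w = round_to_bits(x2 + y2 + z2, 16)
    # Steps 7-8: square root, then round to 20 mantissa bits
    return round_to_bits(math.sqrt(w), 20)

# Compare against a double-precision reference to measure the overall error
a, b, c, d, e, f = 1.0, 0.9, 2.0, 1.7, 0.5, 0.1
reference = math.hypot(a - b, c - d, e - f)
result = pipeline(a, b, c, d, e, f)
print(result, reference, abs(result - reference) / reference)
```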

There are various ways of representing the error of a floating-point number. There is relative error (a * (1 + ε)), the subtly different ULP error (a + ulp(a) * ε), and absolute error (a + ε). Each of them can be used in analysing the error, but all have shortcomings. To get sensible results you often have to take into account what happens precisely inside floating-point calculations. I'm afraid that the 'correct mathematical approach' is a lot of work; instead, I'll give you the following.
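For a concrete feel for the three measures (a sketch only; the values are arbitrary and `math.ulp` needs Python 3.9+):

```python
import math

computed = 0.1 + 0.2              # what floating point gives us
exact    = 0.3                    # what we actually wanted
abs_err  = abs(computed - exact)          # absolute error: a + ε
rel_err  = abs_err / abs(exact)           # relative error: a * (1 + ε)
ulp_err  = abs_err / math.ulp(exact)      # ULP error: a + ulp(a) * ε
print(abs_err, rel_err, ulp_err)          # ulp_err comes out to about 1 here
```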

Simplified ULP-based analysis

The following analysis is quite crude, but it does give a good 'feel' for how much error you end up with; treat the numbers below as examples only.

(a-b): The operation itself gives you up to a 0.5 ULP error (when rounding to nearest even, RNE). The rounding error of this operation can be small compared to the inputs, but if the inputs are very similar and already contain error, you could be left with nothing but noise!
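For example (my own numbers, reusing the `round_to_bits` helper from the question's sketch): two nearly equal inputs are rounded to a 16-bit mantissa before subtracting, and the rounding noise dwarfs the true difference.

```python
import math

def round_to_bits(x, bits):
    # Same RNE helper as in the question sketch.
    m, e = math.frexp(x)
    scale = 2.0 ** (bits + 1)
    return math.ldexp(round(m * scale) / scale, e)

a, b = 1.000008, 1.0          # nearly equal inputs
exact = a - b                 # ~8e-6, about half an ULP of a 16-bit mantissa
noisy = round_to_bits(a, 16) - round_to_bits(b, 16)
print(exact, noisy)           # the 16-bit rounding error dominates the result
```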

(a^2): This operation multiplies not only the input, but also the input error. In relative-error terms, that means the error is at least multiplied by the other operand's mantissa. Interestingly, there is a little normalisation step in the multiplier, which means the relative error is halved if the multiplication result crosses a power-of-two boundary. The worst case is where the result lands just below such a boundary, e.g. two inputs that are both almost sqrt(2). In this case the input error is multiplied to 2*ε*sqrt(2). With an additional final rounding error of 0.5 ULP, the total is an error of ~2 ULP.
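A quick sketch of that doubling, with the input near sqrt(2) as described (the ε value is arbitrary):

```python
import math

x   = math.sqrt(2.0)           # x*x lands just below 2, the worst-case spot
eps = 2**-17                   # pretend x carries this much relative error
rel = ((x * (1 + eps))**2 - x * x) / (x * x)
print(rel / eps)               # ~2: squaring roughly doubles the relative error
```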

Adding positive numbers: The worst case here is just the input errors added together, plus another rounding error. We're now at 3*2 + 0.5 = 6.5 ULP.
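For illustration (arbitrary term values), the worst case where all three input errors point the same way:

```python
import math

terms = [0.11, 0.23, 0.31]               # stand-ins for x2, y2, z2
err   = 2 * math.ulp(1.0)                # pretend each carries ~2 ULP of error
exact = sum(terms)
worst = sum(t + err for t in terms)      # worst case: all errors share a sign
print((worst - exact) / err)             # ~3: the three input errors simply add
```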

sqrt: The worst case for a sqrt is when the input is close to, e.g., 1.0. The error roughly just gets passed through, plus an additional rounding error. We're now at 7 ULP.
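A quick numerical check (arbitrary ε): near 1.0 the derivative of sqrt is about 0.5, so the relative error is in fact roughly halved; treating it as passed through is therefore a safe over-estimate.

```python
import math

w   = 1.0                    # near the worst-case region mentioned above
eps = 2**-17                 # pretend w carries this much relative error
rel = (math.sqrt(w * (1 + eps)) - math.sqrt(w)) / math.sqrt(w)
print(rel / eps)             # ~0.5: sqrt halves the relative error, so
                             # 'passed through' is a conservative bound
```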

Intermediate rounding steps: It will take a bit more work to plug in your intermediate rounding steps. You can model these as an error related to the number of bits you're rounding off. E.g. going from a 23-bit to a 10-bit mantissa with RNE introduces an additional 2^(13-1) ULP error relative to the 23-bit mantissa, or 0.5 ULP relative to the new mantissa (you'll have to scale down your other errors if you want to work with that).
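That worst case can be checked directly; the construction below is my own (exactly half an ULP of the 10-bit mantissa, measured in 23-bit ULPs):

```python
import math

def round_to_bits(x, bits):
    # Same RNE helper as in the question sketch.
    m, e = math.frexp(x)
    scale = 2.0 ** (bits + 1)
    return math.ldexp(round(m * scale) / scale, e)

x = 1.0 + 2**-11              # half an ULP of a 10-bit mantissa above 1.0
r = round_to_bits(x, 10)      # ties-to-even sends this back down to 1.0
print(abs(r - x) / 2**-23)    # 4096 = 2**12 ULPs of the 23-bit mantissa
```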

I'll leave it to you to count the errors in your detailed example, but as the commenters noted, the rounding to a 10-bit mantissa will dominate, and your final result will be accurate to roughly 8 mantissa bits.
