
Python multiplication losing floating point precision

I was working with floating-point numbers recently and noticed something I didn't expect. Here is an example:

a = 0.1
print(f"{a:0.20f}")
# 0.10000000000000000555
b = a * 10
print(f"{b:0.20f}")
# 1.00000000000000000000

I would expect the last print to output 1.00000000000000005551 (i.e., the digits of 0.1 shifted one decimal place to the left).

What I am curious about is why the floating-point error disappears when multiplying by 10. The normal rules of arithmetic suggest that the error would propagate, but that isn't happening. Why does this take place? Is there a way to avoid it?

The exact real number arithmetic product of 10 and 0.1000000000000000055511151231257827021181583404541015625, the IEEE 754 64-bit binary representation of 0.1, is 1.000000000000000055511151231257827021181583404541015625.

It is not exactly representable. It is bracketed by 1.0 and 1.0000000000000002220446049250313080847263336181640625, the next representable value above 1.0.

It is closer to 1.0, so that is the round-to-nearest result of the multiplication.

I calculated the numbers using a short Java program:

import java.math.BigDecimal;

public strictfp class Test {
    public static void main(String[] args) {
        BigDecimal rawTenth = new BigDecimal(0.1);
        BigDecimal realProduct = rawTenth.multiply(BigDecimal.TEN);
        System.out.println(realProduct);
        System.out.println(new BigDecimal(Math.nextUp(1.0)));
    }
}

Output:

1.0000000000000000555111512312578270211815834045410156250
1.0000000000000002220446049250313080847263336181640625
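Since the question is about Python, the same numbers can be reproduced there. This is a minimal sketch, assuming the standard `decimal` module with its working precision raised to 60 digits (an arbitrary choice, just large enough that the product is not rounded); the variable names mirror the Java program and are otherwise illustrative.

```python
import sys
from decimal import Decimal, localcontext

with localcontext() as ctx:
    # The default context keeps 28 significant digits, which would round
    # the product. 60 is an assumed value, chosen to be comfortably enough.
    ctx.prec = 60

    # Decimal(float) converts the stored binary64 value exactly.
    raw_tenth = Decimal(0.1)
    real_product = raw_tenth * 10

    # 1.0 + epsilon is the next representable double above 1.0.
    next_up = Decimal(1.0 + sys.float_info.epsilon)

    # Distances from the exact product to its two representable neighbours.
    dist_down = real_product - 1
    dist_up = next_up - real_product

print(real_product)
print(next_up)
print(dist_down < dist_up)  # True: 1.0 is the nearer neighbour
```

The first two lines printed should match the Java output above, and the final comparison confirms that 1.0 is the closer of the two bracketing values.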

This answer shows how you can determine that converting 1/10 to floating-point and multiplying by 10 will produce exactly 1 using just a little arithmetic; there is no need to calculate large or precise numbers.

Your Python implementation uses the common IEEE-754 binary64 format. (Python is not strict about which floating-point format implementations should use.) In this format, numbers are represented, in effect, as a sign (+ or −) applied to some 53-bit integer multiplied by some power of two. Because 2^−4 ≤ 1/10 < 2^−3, the representable number nearest 1/10 is some integer M multiplied by 2^(−3−53). (The −53 scales the 53-bit integer to between ½ and 1, and the −3 scales that to between 2^−4 and 2^−3.) Let's call that representable number x.

Then we have x = M•2^−56 = 1/10 + e, where e is the rounding error that occurs when we round 1/10 to the nearest representable value. Since we round to the nearest representable value, |e| ≤ ½•2^−56 = 2^−57.
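The decomposition and the error bound can be checked directly in Python. This is only a sketch, assuming a binary64 `float`; it uses `math.frexp` to split the value into a fraction in [½, 1) and a power of two, and `fractions.Fraction` for exact rational arithmetic. The names `mantissa`, `exponent`, and `e` are illustrative.

```python
import math
from fractions import Fraction

x = 0.1

# frexp returns (m, p) with x == m * 2**p and 0.5 <= m < 1.
mantissa, exponent = math.frexp(x)
print(mantissa, exponent)  # 0.8 -3

# Scaling by 2**53 is exact and turns the fraction into the 53-bit integer M.
M = int(mantissa * 2**53)
print(M)  # 7205759403792794

# x is exactly M * 2**(-3-53) = M * 2**-56.
print(Fraction(x) == Fraction(M, 2**56))  # True

# The rounding error e = x - 1/10 is at most half a unit in the last place.
e = Fraction(x) - Fraction(1, 10)
print(abs(e) <= Fraction(1, 2**57))  # True
```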

To find exactly what e is, multiply 1/10 by 2^56. WolframAlpha tells us it is 7205759403792793 + 3/5. To get the nearest representable value, we should round up, so M = 7205759403792794 and e = 2/5 • 2^−56. Although I used WolframAlpha to illustrate this, we do not need M, and we can find e by observing the pattern in powers of two modulo 10: 2^1 → 2, 2^2 → 4, 2^3 → 8, 2^4 → 6, 2^5 → 2, 2^6 → 4, and so on. The pattern repeats with a cycle of 4, and 56 modulo 4 is 0, so 2^56 modulo 10 has the same remainder as 2^4, namely 6, and the fractional part of 2^56/10 is 6/10 = 3/5. That rounds up to the next integer, so M exceeds 2^56/10 by 1 − 3/5 = 2/5, and e = 2/5 • 2^−56.
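Python's arbitrary-precision integers can stand in for WolframAlpha here. The sketch below assumes nothing beyond the standard library; `quotient`, `remainder`, and `cycle` are illustrative names.

```python
from fractions import Fraction

# 2**56 / 10 split into its integer part and remainder.
quotient, remainder = divmod(2**56, 10)
print(quotient, remainder)  # 7205759403792793 6

# The last digit of 2**n repeats with period 4 for n >= 1.
cycle = [pow(2, n, 10) for n in range(1, 9)]
print(cycle)            # [2, 4, 8, 6, 2, 4, 8, 6]
print(pow(2, 56, 10))   # 6

# The fractional part 6/10 is above 1/2, so round up to get M.
M = quotient + 1
e = Fraction(M, 2**56) - Fraction(1, 10)
print(e == Fraction(2, 5) / 2**56)  # True
```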

So x = M•2^−56 = 1/10 + 2/5•2^−56.

Now we can figure out the result of computing 10•x with floating-point arithmetic. The result is as if we first compute 10•x with real-number arithmetic and then round to the nearest representable value. In real-number arithmetic, 10•x = 10•(1/10 + 2/5•2^−56) = 1 + 10•2/5•2^−56 = 1 + 4•2^−56 = 1 + 2^−54. The two neighboring representable values are 1 and 1 + 2^−52, and 1 + 2^−54 is closer to 1 than it is to 1 + 2^−52. So the result is 1.
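The final step can be confirmed with exact rational arithmetic. This is a sketch assuming a binary64 `float` and the default round-to-nearest mode; `exact`, `lower`, and `upper` are illustrative names.

```python
import sys
from fractions import Fraction

x = Fraction(0.1)      # exact value of the double nearest 1/10
exact = 10 * x         # real-number product, no rounding

# The product is exactly 1 + 2**-54.
print(exact == 1 + Fraction(1, 2**54))  # True

# Neighbouring representable values: 1 and 1 + 2**-52.
lower = Fraction(1)
upper = Fraction(1.0 + sys.float_info.epsilon)
print(upper == 1 + Fraction(1, 2**52))  # True

# 1 is the nearer neighbour, so the floating-point product rounds to 1.0.
print(exact - lower < upper - exact)    # True
print(0.1 * 10 == 1.0)                  # True
```

Working in `Fraction` keeps the 2^−54 excess visible, whereas the `float` product discards it at the rounding step.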

