Java Floating point Integer arithmetic

Question

Given the code below, the printed booleans are:

A: false
B: false
C: true

When I subtract anything less than 65 from the sum V1 + V2, nothing changes; it's as if the subtraction never occurs. If I switch the primitive type to double, the issue goes away. Why is this happening?

private static final float V1 = 1076712940;
private static final float V2 = 1070770707;

public static void main(final String[] args) {
    final float y = V1 + V2;//2147483647

    System.out.println("A: ((y - 64) - 1) == (y - 65) --> "
                        + (((y - 64) - 1) == (y - 65))); /* A */
    System.out.println("B: (y > y - 64) --> " + (y > y - 64)); /* B */
    System.out.println("C: (y > y - 65) --> " + (y > y - 65)); /* C */
}

Answer 1

Floats are represented in memory by a sign bit, an exponent, and a mantissa, and they are essentially binary fractions scaled by a power of two. Many decimal numbers cannot be represented exactly as binary fractions, and integers too large to fit in the mantissa cannot be represented exactly either. Additional mantissa bits give more precision; additional exponent bits give more range.

I believe that in your case, a value 64 less than the float you have stored is not different enough to get its own float representation, so the result rounds back to the original value. A double has more bits in the mantissa (53 significant bits versus 24 for a float), so it is more precise and can represent the result of the subtraction exactly.
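To make that concrete, here is a minimal sketch, assuming y is 2147483648 (2^31), which is what V1 + V2 evaluates to in float arithmetic. The class and method names are made up for illustration. Math.ulp reports the distance from a value to the next larger representable value of the same type.

```java
final class PrecisionDemo {
    /** True if subtracting delta leaves the float unchanged after rounding. */
    static boolean floatAbsorbs(float value, float delta) {
        return value - delta == value;
    }

    /** The same check carried out in double precision. */
    static boolean doubleAbsorbs(double value, double delta) {
        return value - delta == value;
    }

    public static void main(String[] args) {
        final float yFloat = 2147483648f;   // 2^31, assumed value of V1 + V2
        final double yDouble = 2147483648d; // the same number as a double

        // Spacing to the next larger representable value of each type.
        System.out.println("float spacing above y:  " + Math.ulp(yFloat));
        System.out.println("double spacing above y: " + Math.ulp(yDouble));

        // The float swallows the subtraction; the double does not.
        System.out.println("float  y - 64 == y: " + floatAbsorbs(yFloat, 64f));
        System.out.println("double y - 64 == y: " + doubleAbsorbs(yDouble, 64d));
    }
}
```

The float spacing at this magnitude is 256.0, while the double spacing is far below 1, which is why the double version can tell y and y - 64 apart.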

Answer 2

This expands a little on the prior answer.

The original program contains, in a comment, a claim that y is 2147483647. It isn't. Due to floating-point rounding, it is 2147483648. The program below looks at y and the representable numbers bracketing it. BigDecimal's toString does an exact conversion without scientific notation, which is clearer for this case. BigDecimal also does exact arithmetic for numbers with finite-length decimal expansions, including all finite float and double numbers.

import java.math.BigDecimal;

public class Test {
  private static final float V1 = 1076712940;
  private static final float V2 = 1070770707;

  public static void main(String[] args) {
    final float y = V1 + V2; // the question claims 2147483647; it is actually 2147483648
    BigDecimal yBD = new BigDecimal(y);
    System.out.println("y = " + yBD);
    BigDecimal down = new BigDecimal(Math.nextDown(y));
    System.out.println("nextDown(y) = " + down + " diff = " + yBD.subtract(down));
    BigDecimal up = new BigDecimal(Math.nextUp(y));
    System.out.println("nextUp(y) = " + up + " diff = " + up.subtract(yBD));
    System.out.println(Float.MAX_VALUE + Float.MAX_VALUE);
  }

}

Output:

y = 2147483648
nextDown(y) = 2147483520 diff = 128
nextUp(y) = 2147483904 diff = 256
Infinity

2147483648 is a power of two, so the gap below it is only 128, but the gap above is 256. Subtracting anything less than 64 has an exact result closer to 2147483648 than to any other representable number. Subtracting 64 gives an exact result halfway between two representable numbers, and round-to-even rounds it to 2147483648. Subtracting 65 has an exact result that is closer to 2147483520.
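The three cases can be checked directly. This is only a sketch: it hard-codes y as 2147483648f instead of computing V1 + V2, and the class name and the exact helper are invented for the example.

```java
import java.math.BigDecimal;

final class RoundingDemo {
    /** Exact decimal value of a float, without scientific notation. */
    static String exact(float x) {
        return new BigDecimal(x).toPlainString();
    }

    public static void main(String[] args) {
        final float y = 2147483648f; // 2^31, the value of V1 + V2

        // 63: nearer to y. 64: a tie, broken towards y. 65: nearer to 2147483520.
        for (int k = 63; k <= 65; k++) {
            System.out.println("y - " + k + " = " + exact(y - k));
        }

        // Test A from the question: (y - 64) - 1 rounds back to y at both steps,
        // while y - 65 rounds down to the next lower float.
        System.out.println("A: " + (((y - 64) - 1) == (y - 65)));
    }
}
```

This prints 2147483648 for y - 63 and y - 64, 2147483520 for y - 65, and "A: false", matching the results in the question.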

In a comment, you ask: "I switched both V1 and V2 to be Float.MAX_VALUE. The results changed to A: true, B: false, C: false. What are your thoughts on this?"

My first thought, confirmed by the last output from my program, is "That makes y infinite." Adding a finite number to an infinity, or subtracting one from it, does not change its value. An infinity is equal to itself, and it is not greater than itself.
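A small sketch of the three tests run against both values of y; the class and method names here are illustrative and are not part of the original program.

```java
final class InfinityDemo {
    // The three comparisons from the question, parameterised on y.
    static boolean testA(float y) { return ((y - 64) - 1) == (y - 65); }
    static boolean testB(float y) { return y > y - 64; }
    static boolean testC(float y) { return y > y - 65; }

    static void report(String label, float y) {
        System.out.println(label + ": y = " + y
                + " A: " + testA(y) + " B: " + testB(y) + " C: " + testC(y));
    }

    public static void main(String[] args) {
        // The finite case from the question: y is 2^31.
        report("finite", 2147483648f);

        // Float.MAX_VALUE + Float.MAX_VALUE overflows to positive infinity.
        report("overflow", Float.MAX_VALUE + Float.MAX_VALUE);
    }
}
```

With an infinite y, every subexpression is again Infinity, so A compares Infinity with Infinity (true), and B and C ask whether Infinity is greater than itself (false).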

In general, it would be easier to see what is going on if you looked directly at the numbers involved, rather than only looking at results of tests and comparisons involving them.
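One way to look directly at a float is a small helper like the following. It is a sketch with a made-up class and method name; it uses only standard library calls (Float.floatToIntBits, Math.getExponent, Math.ulp, Math.nextDown) and prints the exact value, the raw bit pattern, the binary exponent, and the spacing to the next larger float.

```java
import java.math.BigDecimal;

final class FloatInspector {
    /** One-line description: exact value, raw bits, exponent and spacing. */
    static String describe(float x) {
        return new BigDecimal(x).toPlainString()
                + " bits=0x" + Integer.toHexString(Float.floatToIntBits(x))
                + " exponent=" + Math.getExponent(x)
                + " ulp=" + new BigDecimal(Math.ulp(x)).toPlainString();
    }

    public static void main(String[] args) {
        final float y = 2147483648f; // 2^31, the value of V1 + V2

        System.out.println(describe(y));
        System.out.println(describe(Math.nextDown(y)));
    }
}
```

For y this shows exponent 31 and a spacing of 256; for the float just below it, exponent 30 and a spacing of 128, which is the asymmetry around the power of two described above.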

2 answers

solution1
9 ACCEPTED 2016-02-13 19:42:55

solution2
1 2016-02-14 13:13:05
