Java floating point clarification

Question

I am reading Java Puzzlers by Joshua Bloch. In puzzle 28, I am not able to understand the following paragraph:

This works because the larger a floating-point value, the larger the distance between the value and its successor. This distribution of floating-point values is a consequence of their representation with a fixed number of significant bits. Adding 1 to a floating-point value that is sufficiently large will not change the value, because it doesn't "bridge the gap" to its successor.

Why do larger floating point values have larger distances between their values and successors?
In the case of Integer, we add one to get the next Integer, but in the case of float, how do we get the next float value? If I have a float value in IEEE-754 format, do I add 1 to the mantissa part to get the next float?

Answer 1

Imagine a decimal-based format where you are only allowed to set the first 5 digits (i.e. your mantissa has length 5). For small numbers you would be fine: 1.0000, 12.000, 125.00

But for larger numbers you would have to start truncating, e.g. 1113500. The next representable number would be 1113600, which is 100 larger. Any values in between just can't be represented in this format. If you were reading in a value in this range, you would have to round it: find the closest representable value, even if it is not exact.

The problem gets worse the larger the number is. If I reach 34567000000 then the next representable number will be 34568000000, which is a gap of 1000000 or one million. In this way, you can see that the difference between representations depends on the size.

At the other extreme, for a small value such as 0.00010000, the next representable value is 0.00010001, so the gap is just 0.00000001.

Floating point values follow the same principle, but with a binary encoding (powers of two instead of powers of ten).
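The decimal model in this answer can be tried out directly. The following is only a sketch under assumptions: the class name DecimalGapDemo, the helper gapAt and the constant DIGITS are made up for illustration, and the helper only handles nonzero values. It computes the gap between neighbouring numbers in a format limited to 5 significant decimal digits.

```java
import java.math.BigDecimal;

final class DecimalGapDemo {
    // Hypothetical format from the answer: 5 significant decimal digits.
    static final int DIGITS = 5;

    // Gap between neighbouring representable numbers at the magnitude of
    // the given nonzero decimal value, returned as a plain decimal string.
    static String gapAt(String value) {
        BigDecimal v = new BigDecimal(value);
        // Decimal exponent of the leading digit, e.g. 6 for 1113500.
        int exponent = v.precision() - v.scale() - 1;
        // The last kept digit sits (DIGITS - 1) places below the leading digit.
        return BigDecimal.ONE.scaleByPowerOfTen(exponent - (DIGITS - 1)).toPlainString();
    }

    public static void main(String[] args) {
        String[] samples = {"1.0000", "125.00", "1113500", "34567000000"};
        for (String s : samples) {
            System.out.println(s + " -> gap " + gapAt(s));
        }
    }
}
```

The gap is always one unit in the fifth significant digit, so it grows by a factor of ten each time the number gains a digit.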

Answer 2

You can think of floating point as base-2 scientific notation. In floating point, you are limited to a fixed number of bits for the mantissa (aka significand) and for the exponent. How many depends on whether you are using a float (24-bit significand, counting the implied bit, and an 8-bit exponent) or a double (53-bit significand and an 11-bit exponent).

It's a little more familiar to think of base-10 scientific notation. Imagine that the mantissa is limited to an integer and is always represented by 3 significant digits. Now consider these two pairs of successive numbers in this representation:

100 x 10^0 and 101 x 10^0 (100 and 101)
100 x 10^1 and 101 x 10^1 (1000 and 1010)

Note that the distance (aka difference) between the numbers in the first pair is 1, while with the second pair it is 10. In both pairs, the mantissas differ by 1, which is the smallest difference there can be between integers, but the difference is scaled by the exponent. That's why larger numbers have bigger steps between them in floating point (your first question).
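The same scaling can be observed for Java's binary float. This is a rough sketch, assuming positive, normal float values; the class name ExponentScalingDemo and the helper gapFromExponent are invented for illustration. With a 24-bit significand, the lowest significand bit is worth 2^(exponent - 23), which should agree with what Math.ulp reports.

```java
final class ExponentScalingDemo {
    // Gap to the next larger float, derived only from the binary exponent.
    // Assumes a positive, normal float (not zero, subnormal, infinite or NaN).
    static float gapFromExponent(float value) {
        // A float keeps 24 significand bits, so the last one is 23 places
        // below the leading bit.
        return Math.scalb(1.0f, Math.getExponent(value) - 23);
    }

    public static void main(String[] args) {
        float[] samples = {1.0f, 1000.0f, 1.0e10f};
        for (float f : samples) {
            System.out.println(f
                    + ": exponent " + Math.getExponent(f)
                    + ", gap from exponent " + gapFromExponent(f)
                    + ", Math.ulp " + Math.ulp(f));
        }
    }
}
```

Every time the exponent goes up by one, the step between neighbouring floats doubles, just as the step in the decimal example grows tenfold.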

Regarding the second question, let's look at adding 1 (100 x 10^-2) to the number 1000 (100 x 10^1):

100 x 10^1 + 100 x 10^-2 = 1001 x 10^0

but we are limited to only three significant digits in the mantissa, so the last number gets normalized (after rounding) to:

100 x 10^1

which leaves us back at 1000. To change a floating point value, you need to add at least half the difference between that number and the next number; this minimum difference varies with the scale of the number.
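The "at least half the gap" rule can be checked on a real float. The sketch below is illustrative only: the class name AbsorbedAdditionDemo and the helper changes are assumptions, and the sample value 1.0e10f was picked because it is exactly representable as a float and its gap to the next float is 1024.

```java
final class AbsorbedAdditionDemo {
    // True if adding the addend produces a float different from base.
    static boolean changes(float base, float addend) {
        return base + addend != base;
    }

    public static void main(String[] args) {
        float big = 1.0e10f;
        System.out.println("gap at big = " + Math.ulp(big));

        // Far less than half the gap: the sum rounds back to big.
        System.out.println("big + 1   changes? " + changes(big, 1.0f));
        // Still less than half the gap (512): rounds back to big.
        System.out.println("big + 400 changes? " + changes(big, 400.0f));
        // More than half the gap: rounds up to the next float.
        System.out.println("big + 600 changes? " + changes(big, 600.0f));
        System.out.println("big + 600 == nextUp(big)? "
                + (big + 600.0f == Math.nextUp(big)));
    }
}
```

Note that adding 600 does not give big + 600; the result snaps to the neighbouring float, which is 1024 above big.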

Exactly the same kind of thing is going on with binary floating point. There are more details (e.g., normalization, guard digits, implied radix point, implied bit), which you can read about in the excellent article "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

Answer 3

Floating point numbers are represented as a combination of mantissa and exponent, where the value of the number is mantissa * 2^(exponent). If we assume the mantissa is limited to 2 digits (to make things simpler) and you have the number 1.1 * 2^100, which is very large, the "next" value would be 1.2 * 2^100. So if you're doing mixed-scale calculations, 1.1*2^100 + 1 will be rounded back to 1.1*2^100, since there's not enough space in the mantissa to retain the accurate result.
Starting with Java 6 you have the utility methods Math.nextUp() and Math.nextAfter(), which allow you to "iterate" over all possible double/float values. Before that, you needed to add 1 to the mantissa, and possibly take care of overflow, to get the next/previous values.
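Both approaches from this answer can be put side by side. This is a minimal sketch, assuming positive, finite floats below Float.MAX_VALUE; the class name NextFloatDemo and the helper nextByBits are hypothetical. For such values, adding 1 to the raw IEEE-754 bit pattern should give the same result as Math.nextUp, and a full mantissa simply carries into the exponent field.

```java
final class NextFloatDemo {
    // Manual successor: add 1 to the raw bit pattern.
    // Only valid for positive, finite floats below Float.MAX_VALUE.
    static float nextByBits(float value) {
        return Float.intBitsToFloat(Float.floatToIntBits(value) + 1);
    }

    public static void main(String[] args) {
        // 0x3FFFFFFF is the largest float below 2.0: its mantissa bits are
        // all ones, so the increment carries into the exponent.
        float justBelowTwo = Float.intBitsToFloat(0x3FFFFFFF);
        float[] samples = {1.0f, justBelowTwo, 16777216.0f, (float) Long.MAX_VALUE};
        for (float f : samples) {
            System.out.println(f
                    + ": by bits " + nextByBits(f)
                    + ", Math.nextUp " + Math.nextUp(f)
                    + ", same? " + (nextByBits(f) == Math.nextUp(f)));
        }
    }
}
```

Negative values, zero, infinities and NaN need separate handling with the bit trick, which is one reason to prefer Math.nextUp or Math.nextAfter when they are available.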

Answer 4

Although it does not explain why, this sample code shows how to calculate the distance between a float and the next representable float, and gives an example for a large number. f and g should be Integer.MAX_VALUE apart, but they are the same. The next value is h, which is 1099511627776 larger.

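import java.math.BigDecimal;

public class FloatGap {
    public static void main(String[] args) {
        // Long.MAX_VALUE (2^63 - 1) is not representable as a float; it rounds to 2^63.
        float f = Long.MAX_VALUE;
        System.out.println("f = " + new BigDecimal(f));
        System.out.println("f bits = " + Float.floatToIntBits(f));
        // Integer.MAX_VALUE is far less than half the gap at this magnitude, so g == f.
        float g = f - Integer.MAX_VALUE;
        System.out.println("g = f - Integer.MAX_VALUE = " + new BigDecimal(g));
        System.out.println("g bits = " + Float.floatToIntBits(g));
        System.out.println("f == g? " + (f == g));
        // Incrementing the raw bit pattern yields the next representable float.
        float h = Float.intBitsToFloat(Float.floatToIntBits(f) + 1);
        System.out.println("h = " + new BigDecimal(h));
        System.out.println("h bits = " + Float.floatToIntBits(h));
        System.out.println("h - f = " + new BigDecimal(h).subtract(new BigDecimal(f)));
    }
}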

outputs:

f = 9223372036854775808
f bits = 1593835520
g = f - Integer.MAX_VALUE = 9223372036854775808
g bits = 1593835520
f == g? true
h = 9223373136366403584
h bits = 1593835521
h - f = 1099511627776
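As a cross-check on the numbers above, the same experiment can be repeated with a double, which keeps 53 significand bits instead of 24. This is only a sketch; the class name FloatVersusDoubleDemo and its helper methods are assumptions made for the example.

```java
final class FloatVersusDoubleDemo {
    // True if subtracting delta from start (held as a float) leaves it unchanged.
    static boolean floatAbsorbs(long start, int delta) {
        float f = start;
        return f - delta == f;
    }

    // Same check with the value held as a double.
    static boolean doubleAbsorbs(long start, int delta) {
        double d = start;
        return d - delta == d;
    }

    public static void main(String[] args) {
        // Gap to the next value at 2^63 for each type.
        System.out.println("float gap  = " + Math.ulp((float) Long.MAX_VALUE));
        System.out.println("double gap = " + Math.ulp((double) Long.MAX_VALUE));

        System.out.println("float absorbs Integer.MAX_VALUE?  "
                + floatAbsorbs(Long.MAX_VALUE, Integer.MAX_VALUE));
        System.out.println("double absorbs Integer.MAX_VALUE? "
                + doubleAbsorbs(Long.MAX_VALUE, Integer.MAX_VALUE));
    }
}
```

At this magnitude the float gap is 2^40 (1099511627776) while the double gap is 2^11 (2048), so the double is fine-grained enough to register the subtraction and the float is not.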

4 answers

Answer 1: score 6, 2013-08-07 17:12:40

Answer 2: score 5, accepted, 2013-08-07 17:19:23

Answer 3: score 4, 2013-08-07 17:11:55

Answer 4: score 2, 2013-08-07 17:14:41
