Difference between double and float in floating-point precision

Question

After reading this question and this MSDN blog post, I tried a few examples to test this:

Console.WriteLine(0.8-0.7 == 0.1);

And yes, the expected output is False. So I tried casting the expressions on both sides to double and to float to see whether I would get a different result:

Console.WriteLine((float)(0.8-0.7) == (float)(0.1));
Console.WriteLine((double)(0.8-0.7) == (double)(0.1));

The first line outputs True but the second line outputs False. Why is this happening?

Furthermore,

Console.WriteLine(8-0.7 == 7.3);
Console.WriteLine(8.0-0.7 == 7.3);

Both of the lines above give True even without casting. And ...

Console.WriteLine(18.01-0.7 == 17.31);

This line outputs False. How is subtracting from 8 different from subtracting from 18.01, if in both cases a floating-point number is subtracted?

I've tried reading through the blog and the question, and I can't seem to find an answer elsewhere. Can someone please explain to me in layman's terms why all of this happens? Thank you in advance.

EDIT:

Console.WriteLine(8.001-0.001 == 8); // this returns false
Console.WriteLine(8.01-0.01 == 8); // this returns true

Note: I am using the .NET Fiddle online C# compiler.

Answer 1

The Cases of 0.8−0.7

In 0.8-0.7 == 0.1, none of the literals is exactly representable in double. The nearest representable values are 0.8000000000000000444089209850062616169452667236328125 for .8, 0.6999999999999999555910790149937383830547332763671875 for .7, and 0.1000000000000000055511151231257827021181583404541015625 for .1. When the first two are subtracted, the result is 0.100000000000000088817841970012523233890533447265625. As this is not equal to the third, 0.8-0.7 == 0.1 evaluates to false.
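The exact values quoted above can be checked by decoding the bits of a double and expanding them with arbitrary-precision integers. The following is a minimal sketch, not part of the original answer: the class name ExactDouble and the method ToExactString are hypothetical helpers, and it assumes System.Numerics is available (on .NET Framework this needs a reference to the System.Numerics assembly).

```csharp
using System;
using System.Numerics;

static class ExactDouble
{
    // Returns the exact decimal expansion of a finite double (no rounding).
    public static string ToExactString(double d)
    {
        long bits = BitConverter.DoubleToInt64Bits(d);
        bool negative = bits < 0;
        int exponent = (int)((bits >> 52) & 0x7FF);   // 11-bit biased exponent
        long mantissa = bits & 0xFFFFFFFFFFFFFL;      // low 52 bits

        if (exponent == 0x7FF)
            throw new ArgumentException("NaN and infinities have no decimal expansion.");

        if (exponent == 0) exponent = 1;              // subnormal: no implicit leading bit
        else mantissa |= 1L << 52;                    // normal: restore the implicit leading bit
        exponent -= 1075;                             // now value == mantissa * 2^exponent

        string sign = negative ? "-" : "";
        if (mantissa == 0) return sign + "0";

        if (exponent >= 0)
            return sign + ((BigInteger)mantissa << exponent).ToString();

        // mantissa / 2^k == mantissa * 5^k / 10^k, so the digits form an integer
        // whose decimal point is shifted k places to the left.
        int k = -exponent;
        string digits = (mantissa * BigInteger.Pow(5, k)).ToString().PadLeft(k + 1, '0');
        string whole = digits.Substring(0, digits.Length - k);
        string fraction = digits.Substring(digits.Length - k).TrimEnd('0');
        return sign + (fraction.Length == 0 ? whole : whole + "." + fraction);
    }

    static void Main()
    {
        // The literals are converted to double by the compiler; the subtraction
        // is assumed to be carried out in plain double precision.
        Console.WriteLine(ToExactString(0.8));
        Console.WriteLine(ToExactString(0.7));
        Console.WriteLine(ToExactString(0.1));
        Console.WriteLine(ToExactString(0.8 - 0.7));
    }
}
```

Comparing the last two printed lines shows directly why the equality test fails: the stored difference and the stored 0.1 are two different doubles.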

In (float)(0.8-0.7) == (float)(0.1), the result of 0.8-0.7 and 0.1 are each converted to float. The float value nearest to the former, 0.100000000000000088817841970012523233890533447265625, is 0.100000001490116119384765625. The float value nearest to the latter, 0.1000000000000000055511151231257827021181583404541015625, is 0.100000001490116119384765625. Since these are the same, (float)(0.8-0.7) == (float)(0.1) evaluates to true.
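The effect of narrowing to float can be observed by printing the raw bit patterns before and after the cast. This is a small sketch rather than anything from the original answer; FloatRounding and Bits are hypothetical names, and it assumes the runtime does not keep extra precision across the explicit casts.

```csharp
using System;

static class FloatRounding
{
    // Raw IEEE-754 bit pattern of a float, as 8 hex digits.
    public static string Bits(float f)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(f), 0).ToString("X8");
    }

    static void Main()
    {
        double difference = 0.8 - 0.7;
        double tenth = 0.1;

        // As doubles, the two values have different bit patterns.
        Console.WriteLine(BitConverter.DoubleToInt64Bits(difference).ToString("X16"));
        Console.WriteLine(BitConverter.DoubleToInt64Bits(tenth).ToString("X16"));

        // A float keeps far fewer significand bits, so both values are expected
        // to round to the same float and print the same pattern here.
        Console.WriteLine(Bits((float)difference));
        Console.WriteLine(Bits((float)tenth));
        Console.WriteLine((float)difference == (float)tenth);
    }
}
```

The two doubles differ only in their lowest few bits, which lie far below the precision a float retains, so the difference disappears when both are rounded.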

In (double)(0.8-0.7) == (double)(0.1), the result of 0.8-0.7 and 0.1 are each converted to double. Since they are already double, the casts have no effect, and the result is the same as for 0.8-0.7 == 0.1.

Notes

The C# specification, version 5.0, indicates that float and double are the IEEE-754 32-bit and 64-bit floating-point types. I do not see it explicitly state that they are the binary floating-point formats rather than the decimal formats, but the characteristics described make this evident. The specification also states that IEEE-754 arithmetic is generally used, with round-to-nearest (presumably round-to-nearest-ties-to-even), subject to the exception below.

The C# specification allows floating-point arithmetic to be performed with more precision than the nominal type. Clause 4.1.6 says “… Floating-point operations may be performed with higher precision than the result type of the operation…” This can complicate analysis of floating-point expressions in general, but it does not concern us in the instance of 0.8-0.7 == 0.1, because the only applicable operation is the subtraction of 0.7 from 0.8, and these numbers are in the same binade (they have the same power of two in the floating-point representation), so the result of the subtraction is exactly representable and additional precision will not change it. As long as the conversion of the source texts 0.8, 0.7, and 0.1 to double does not use extra precision and the cast to float produces a float with no extra precision, the results will be as stated above. (The C# standard says in clause 6.2.1 that a conversion from double to float yields a float value, although it does not explicitly state that no extra precision may be used at this point.)

Additional Cases

In 8-0.7 == 7.3, we have 8 for 8, 7.29999999999999982236431605997495353221893310546875 for 7.3, 0.6999999999999999555910790149937383830547332763671875 for 0.7, and 7.29999999999999982236431605997495353221893310546875 for 8-0.7, so the result is true.

Note that the additional precision allowed by the C# specification could affect the result of 8-0.7. A C# implementation that used extra precision for this operation could produce false for this case, as it would get a different result for 8-0.7.

In 18.01-0.7 == 17.31, we have 18.010000000000001563194018672220408916473388671875 for 18.01, 0.6999999999999999555910790149937383830547332763671875 for 0.7, 17.309999999999998721023075631819665431976318359375 for 17.31, and 17.31000000000000227373675443232059478759765625 for 18.01-0.7, so the result is false.

How is subtracting from 8 different from subtracting from 18.01, if in both cases a floating-point number is subtracted?

18.01 is larger than 8 and requires a greater power of two in its floating-point representation. Similarly, the result of 18.01-0.7 is larger than that of 8-0.7. This means the bits in their significands (the fraction portion of the floating-point representation, which is scaled by the power of two) represent greater values, causing the rounding errors in the floating-point operations to be generally greater. In general, a floating-point format has a fixed span: there is a fixed distance from the high bit retained to the low bit retained. When you change to numbers with more bits on the left (high bits), some bits on the right (low bits) are pushed out, and the results change.
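How the spacing between neighbouring doubles grows with magnitude can be measured by stepping to the next representable value. The sketch below is illustrative only; Spacing and Ulp are hypothetical names, and Ulp is only meaningful for positive finite inputs below double.MaxValue.

```csharp
using System;

static class Spacing
{
    // Gap between a positive finite double and the next larger double.
    // Adding 1 to the bit pattern of a positive finite double yields its successor.
    public static double Ulp(double d)
    {
        long bits = BitConverter.DoubleToInt64Bits(d);
        return BitConverter.Int64BitsToDouble(bits + 1) - d;
    }

    static void Main()
    {
        // Values taken from the question, from small to large.
        foreach (double x in new[] { 0.1, 0.7, 7.3, 8.0, 17.31, 18.01 })
            Console.WriteLine(x.ToString("R") + "  spacing: " + Ulp(x).ToString("R"));
    }
}
```

Every time a value crosses a power of two the spacing doubles, so results near 17.31 can only be resolved four times more coarsely than results near 7.3.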

Answer 1: 5, accepted, 2019-06-25 11:48:05
