
Float vs Double

Is there ever a case where a comparison (equals()) between two floating-point values would return false if you compare them as double but true if you compare them as float?

I'm writing a procedure, as part of my group project, to compare two numeric values of any given types. There are four types I have to deal with altogether: double, float, int and long. I'd like to handle double and float in one function; that is, I'd cast any float to double and do the comparison.

Would this lead to any incorrect results?

Thanks.

If you're converting doubles to floats and the difference between them is beyond the precision of the float type, you can run into trouble.

For example, say you have the two double values:

9.876543210
9.876543211

and suppose the precision of a float were only six decimal digits. Both float values would then be 9.87654, hence equal, even though the double values themselves are not equal.

However, if you're talking about floats being cast to doubles, then identical floats will give you identical doubles. If the floats are different, the extra precision will ensure the doubles are distinct as well.
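
Both directions can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it uses the real precision of float (a 24-bit significand) rather than the six-digit simplification above.

```java
// Hypothetical demo class; the names are illustrative, not from the question.
final class NarrowingDemo {
    /** True when two distinct doubles become the same float after narrowing. */
    static boolean collapseWhenNarrowed(double x, double y) {
        return x != y && (float) x == (float) y;
    }

    /** True when widening both floats to double leaves the == result unchanged. */
    static boolean wideningPreservesEquality(float a, float b) {
        return (a == b) == ((double) a == (double) b);
    }

    public static void main(String[] args) {
        double x = 9.876543210;
        double y = 9.876543211;
        System.out.println(x == y);                  // compared as double
        System.out.println((float) x == (float) y);  // compared as float

        float f = 9.876543f;
        float g = Math.nextUp(f); // the closest float above f
        System.out.println(wideningPreservesEquality(f, f));
        System.out.println(wideningPreservesEquality(f, g));
    }
}
```

With the two values from the example, the double comparison is false and the float comparison is true, while even adjacent floats stay distinct after widening.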

As long as you are not mixing promoted floats and natively calculated doubles in your comparison you should be OK, but take care:

Comparing floats (or doubles) for equality is difficult; see this lengthy but excellent discussion.

Here are some highlights:

  1. You can't rely on == in general, because of the limited precision of floating-point formats: results that are mathematically equal can differ after rounding.

  2. float(0.1) and double(0.1) are different values (0.100000001490116119384765625 and 0.1000000000000000055511151231257827021181583404541015625, respectively). In your case, this means that comparing two floats (by converting both to double) will probably be OK, but be careful if you want to compare a float with a double.

  3. It's common to use an epsilon, or small value, as an absolute tolerance (floats a and b are considered equal if |a - b| < epsilon). In C, float.h defines FLT_EPSILON (the gap between 1.0f and the next larger float), which is often used as a starting point. However, this type of comparison doesn't work where a and b are both very small, or both very large.

  4. You can address this by scaling epsilon relative to the sizes of a and b, but this breaks down in some cases (like comparisons to zero).

  5. You can compare the integer representations of the floating-point numbers to find out how many representable floats there are between them. This is called the ULP difference, for "Units in the Last Place" difference. (Java's Float.equals() also compares integer bit patterns, obtained from Float.floatToIntBits(), but it only tests them for exact equality.) It's generally good, but also breaks down when comparing against zero.
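
Items 3 and 5 can be sketched in Java roughly as follows. The class, the method names and the tolerance of 1.0e-6f are assumptions chosen for illustration; this is not the article's code.

```java
// Hypothetical sketch of the absolute-epsilon and ULP-distance ideas.
final class FloatCompareSketch {
    /** Absolute-epsilon test from item 3: |a - b| <= epsilon. */
    static boolean absEqual(float a, float b, float epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    /** Maps a float to an int that increases in step with the float's value. */
    static int orderedBits(float f) {
        int bits = Float.floatToIntBits(f);
        // Negative floats have the sign bit set; mirror them below zero
        // so that -0.0f and +0.0f both map to 0.
        return bits < 0 ? Integer.MIN_VALUE - bits : bits;
    }

    /** Number of representable floats separating a and b (item 5). */
    static long ulpDistance(float a, float b) {
        if (Float.isNaN(a) || Float.isNaN(b)) {
            return Long.MAX_VALUE; // NaN is not comparable to anything
        }
        return Math.abs((long) orderedBits(a) - (long) orderedBits(b));
    }

    public static void main(String[] args) {
        float big = 1.0e10f;
        float bigNeighbour = Math.nextUp(big);
        // Adjacent large floats: far apart in absolute terms, 1 ULP apart.
        System.out.println(absEqual(big, bigNeighbour, 1.0e-6f));
        System.out.println(ulpDistance(big, bigNeighbour));
        // Tiny floats differing by a factor of two: "equal" by absolute epsilon.
        System.out.println(absEqual(1.0e-10f, 2.0e-10f, 1.0e-6f));
        System.out.println(ulpDistance(1.0e-10f, 2.0e-10f));
        // Against zero, the ULP distance of even a tiny number is enormous.
        System.out.println(ulpDistance(0.0f, 1.0e-10f));
    }
}
```

The fixed epsilon rejects two neighbouring floats near 1.0e10 and accepts 1.0e-10f and 2.0e-10f, while the ULP distance handles both sensibly but reports a huge gap between zero and any small non-zero number.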

The article concludes:

Know what you're doing

There is no silver bullet. You have to choose wisely.

  • If you are comparing against zero, then relative epsilons and ULPs based comparisons are usually meaningless. You'll need to use an absolute epsilon, whose value might be some small multiple of FLT_EPSILON and the inputs to your calculation. Maybe.
  • If you are comparing against a non-zero number then relative epsilons or ULPs based comparisons are probably what you want. You'll probably want some small multiple of FLT_EPSILON for your relative epsilon, or some small number of ULPs. An absolute epsilon could be used if you knew exactly what number you were comparing against.
  • If you are comparing two arbitrary numbers that could be zero or non-zero then you need the kitchen sink. Good luck and God speed.
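
One possible reading of "the kitchen sink" is an absolute tolerance for values near zero combined with a relative tolerance elsewhere. The Java sketch below is an assumption about how that could look, not the article's implementation; nearlyEqual and both tolerance values are hypothetical and would need tuning for a real calculation.

```java
// Hypothetical combined comparison: absolute epsilon near zero, relative elsewhere.
final class CombinedCompare {
    static boolean nearlyEqual(float a, float b, float absEps, float relEps) {
        if (Float.isNaN(a) || Float.isNaN(b)) {
            return false; // NaN never compares equal
        }
        if (a == b) {
            return true; // exact match, including equal infinities and +0/-0
        }
        if (Float.isInfinite(a) || Float.isInfinite(b)) {
            return false; // an infinity is only equal to itself
        }
        float diff = Math.abs(a - b);
        if (diff <= absEps) {
            return true; // absolute check, meaningful close to zero
        }
        float largest = Math.max(Math.abs(a), Math.abs(b));
        return diff <= largest * relEps; // relative check for everything else
    }

    public static void main(String[] args) {
        float absEps = 1.0e-6f;               // assumed absolute tolerance
        float relEps = 4.0f * Math.ulp(1.0f); // a small multiple of FLT_EPSILON

        float sum = 0.0f;
        for (int i = 0; i < 10; i++) {
            sum += 0.1f; // accumulates rounding error
        }
        System.out.println(nearlyEqual(sum, 1.0f, absEps, relEps));
        System.out.println(nearlyEqual(1000.0f, 1000.1f, absEps, relEps));
    }
}
```

Math.ulp(1.0f) is Java's equivalent of C's FLT_EPSILON. The first call accepts the accumulated sum as equal to 1.0f; the second rejects two values that genuinely differ.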

So, to answer your question:

  • If you are downgrading doubles to floats, then you might lose precision, and incorrectly report two different doubles as equal (as paxdiablo points out).
  • If you are upgrading identical floats to double, then the added precision won't be a problem unless you are comparing a float with a double. (Say you had 1.234 in a float, with only 4 decimal digits of accuracy; then the double 1.2345 might represent the same value as the float. In this case you'd probably be better off doing the comparison at the precision of the float or, more generally, at the error level of the least accurate representation in the comparison.)
  • If you know the number you'll be comparing with, you can follow the advice quoted above.
  • If you're comparing arbitrary numbers (which could be zero or non-zero), there's no way to compare them correctly in all cases; pick one comparison and know its limitations.

A couple of practical considerations (since this sounds like it's for an assignment):

  • The epsilon comparison mentioned by most is probably fine (but include a discussion of the limitations in the write-up). If you're ever planning to compare doubles to floats, try to do it in float; otherwise, do all comparisons in double. Even better, just use doubles everywhere.

  • If you want to totally ace the assignment, include a write-up of the issues when comparing floats and the rationale for why you chose any particular comparison method.

I don't understand why you're doing this at all. The == operator already caters for all possible types on both sides, with extensive rules on type coercion and widening which are already specified in the relevant language standards. All you have to do is use it.

I'm perhaps not answering the OP's question but rather responding to some more or less fuzzy advice which requires clarification.

Comparing two floating-point values for equality is absolutely possible and can be done. Whether the type is single or double precision is often of less importance.

Having said that, the steps leading up to the comparison itself require great care and a thorough understanding of floating-point dos and don'ts, whys and why nots.

Consider the following C statements:

result = a * b / c;
result = (a * b) / c;
result = a * (b / c);

In most naive floating-point programming they are seen as "equivalent", i.e. producing the "same" result. In the real world of floating-point they may be. Or actually, the first two are equivalent (the first is evaluated left to right under C's rules for operators of the same precedence, which is exactly what the second spells out). The third may or may not be equivalent to the first two.
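
A small Java translation of the three statements makes the difference visible. Java's * and / also associate left to right, so the same reasoning applies; the class name and the input values 0.1, 3.0 and 3.0 are chosen here purely for illustration.

```java
// Hypothetical demo of how evaluation order changes a floating-point result.
final class EvaluationOrder {
    static double leftToRight(double a, double b, double c) {
        return a * b / c;   // parsed as (a * b) / c
    }

    static double multiplyFirst(double a, double b, double c) {
        return (a * b) / c;
    }

    static double divideFirst(double a, double b, double c) {
        return a * (b / c);
    }

    public static void main(String[] args) {
        double a = 0.1, b = 3.0, c = 3.0;
        System.out.println(leftToRight(a, b, c));
        System.out.println(multiplyFirst(a, b, c));
        System.out.println(divideFirst(a, b, c));
        System.out.println(multiplyFirst(a, b, c) == divideFirst(a, b, c));
    }
}
```

For these inputs b / c is exactly 1.0, so the third form returns a unchanged, whereas a * b has to be rounded before the division and the first two forms end up one unit in the last place away.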

Why is this?

"a * b / c" or "b / c * a" may raise the "inexact" exception, i.e. an intermediate or the final result (or both) is not exactly representable in floating-point format. If this is the case the results will be more or less subtly different. This may or may not lead to the end results being amenable to an equality comparison. Being aware of this and single-stepping through operations one at a time - noting intermediate results - will allow the patient programmer to "beat the system", i.e. construct a quality floating-point comparison for practically any situation.

For everyone else, passing over the equality comparison for floating-point numbers is good, solid advice.

It's really a bit ironic because most programmers know that integer math results in predictable truncations in various situations. When it comes to floating-point almost everyone is more or less thunderstruck that results are not exact. Go figure.

You should be okay to make that cast as long as the equality test involves a delta.

For example: abs((double) floatVal1 - (double) floatVal2) < .000001 should work.

Edit in response to the question change

No, you would not. The above still stands.

For a comparison between float f and double d, you can calculate the difference of f and d. If abs(f - d) is less than some threshold, you can consider them equal. The threshold can be either absolute or relative, depending on your application's requirements. There are some good solutions here. I hope this helps.

Would I ever get an incorrect result if I promote 2 floats to double and do a 64-bit comparison rather than a 32-bit comparison?

No.

If you start with two floats, which could be float variables (float x = foo();) or float constants (1.234234234f), then you can compare them directly, of course. If you convert them to double and then compare them, the results will be identical.

This works because double is a superset of float. That is, every value that can be stored in a float can be stored in a double. The range of the exponent and the precision of the mantissa are both increased. There are billions of values that can be stored in a double but not in a float, but there are zero values that can be stored in a float but not in a double.
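
The superset claim can be spot-checked by walking a sample of all 32-bit patterns and confirming that widening never changes a comparison. The Java sketch below is an illustration with made-up names; a stride of 1 would cover every float, at the cost of a longer run.

```java
// Hypothetical sweep over sampled float bit patterns.
final class WideningSweep {
    /** Counts sampled floats for which widening to double changes a result. */
    static int countDisagreements(int stride) {
        if (stride <= 0) {
            throw new IllegalArgumentException("stride must be positive");
        }
        int disagreements = 0;
        float previous = 0.0f;
        // A long counter lets the loop pass Integer.MAX_VALUE without wrapping.
        for (long bits = Integer.MIN_VALUE; bits <= Integer.MAX_VALUE; bits += stride) {
            float current = Float.intBitsToFloat((int) bits);
            double widened = current;
            // Every non-NaN float must survive the trip to double and back.
            boolean roundTrips = Float.isNaN(current)
                    ? Double.isNaN(widened)
                    : (float) widened == current;
            // == and < must give the same answer before and after widening.
            boolean sameEquals = (current == previous) == (widened == (double) previous);
            boolean sameLess = (current < previous) == (widened < (double) previous);
            if (!roundTrips || !sameEquals || !sameLess) {
                disagreements++;
            }
            previous = current;
        }
        return disagreements;
    }

    public static void main(String[] args) {
        System.out.println(countDisagreements(9973));
    }
}
```

A count of zero is what the answer predicts, since widening a float to double is an exact conversion.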

As discussed in my float comparison article, it can be tricky to do a meaningful comparison between float or double values, because rounding errors may have crept in. But converting both numbers from float to double does not change this. All of the mentions of epsilons (which are often but not always needed) are completely orthogonal to the question.

On the other hand, comparing a float to a double is madness. 1.1 (a double) is not equal to 1.1f (a float) because 1.1 cannot be exactly represented in either, and the two types round it to different values.
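
The mixed case is easy to reproduce in Java, where a float compared against a double is silently widened first. In this sketch the two helper names are hypothetical; they simply make the direction of the conversion explicit.

```java
// Hypothetical helpers contrasting the two ways to compare a float with a double.
final class MixedComparison {
    /** Widens the float: what f == d does implicitly. */
    static boolean equalAsDouble(float f, double d) {
        return (double) f == d;
    }

    /** Narrows the double: compares at the precision of the float. */
    static boolean equalAsFloat(float f, double d) {
        return f == (float) d;
    }

    public static void main(String[] args) {
        System.out.println(equalAsDouble(1.1f, 1.1)); // rounded differently
        System.out.println(equalAsFloat(1.1f, 1.1));  // same float after narrowing
        System.out.println(equalAsDouble(0.5f, 0.5)); // 0.5 is exact in both types
    }
}
```

Comparing at float precision makes 1.1f and 1.1 agree, but it also hides any difference smaller than float precision, which is the narrowing risk described earlier in the thread.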

