Finding the largest number in a list of doubles: do I need to worry about precision?

Question

Suppose I have a vector and need to find the maximum number. My colleague told me that doubles must be treated in a special way to do this; his code looks like this:

double max = v[0];
for (int i = 0; i < v.size(); ++i) {
   if (compare(max, v[i]) < 0) max = v[i];
}

int compare(double a, double b) {
    double z = a - b;
    if (z < -0.000001) return -1;
    if (z > 0.000001) return 1;
    return 0;
}

I don't think it needs to be that complex; simply using '<' or '>' does the job, and it shouldn't need to care about equality. But my colleague insists that comparing with an epsilon is a must for finding the maximum. Is that true?

Answer 1

For the purpose of finding the largest number, you're right: a simple < suffices.
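As a minimal sketch of that point (the function names `largest` and `largest_with_epsilon` are made up for illustration, and a non-empty, NaN-free vector is assumed), `std::max_element` compares with `operator<` by default, while an epsilon-based loop like the one in the question can settle on an element that is not the largest:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Plain maximum: std::max_element uses operator< by default.
// Assumes v is non-empty and contains no NaNs.
double largest(const std::vector<double>& v) {
    return *std::max_element(v.begin(), v.end());
}

// The epsilon-based loop from the question, kept only for comparison.
double largest_with_epsilon(const std::vector<double>& v) {
    double max = v[0];
    for (std::size_t i = 1; i < v.size(); ++i) {
        // Same test as compare(max, v[i]) < 0 in the question.
        if (max - v[i] < -0.000001) max = v[i];
    }
    return max;
}
```

For the input {1.0, 1.0000005, 1.0000009}, `largest` returns 1.0000009, whereas `largest_with_epsilon` returns 1.0, because no element ever exceeds the running maximum by more than the epsilon.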

What your colleague is thinking of (we hope, anyway) is dealing with equality of floating point numbers. For example, if you're computing what should be the same value in two different ways, you might easily see some minute difference between the two. Even (a+b)+c vs. a+(b+c) can change a result (even though from a mathematical viewpoint, the two should be identical).
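The grouping effect is easy to observe. The helper below is a hypothetical illustration, and it assumes IEEE 754 doubles evaluated without extended-precision intermediates (the usual situation on x86-64):

```cpp
// Returns true when the two groupings of the same sum give different doubles.
bool grouping_changes_sum(double a, double b, double c) {
    double left = (a + b) + c;   // add a and b first
    double right = a + (b + c);  // add b and c first
    return left != right;
}
```

Under those assumptions, `grouping_changes_sum(0.1, 0.2, 0.3)` is true, while `grouping_changes_sum(1.0, 2.0, 3.0)` is false because every intermediate value there is exactly representable.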

When you do this, however, you normally want to scale the difference you allow between the numbers based on the magnitude of the numbers.

For example, let's consider a double that can represent about 15 significant digits.

If your numbers are around 1e+200, then the smallest difference between two such numbers that can be represented is approximately 1e+185. Asking whether the difference is smaller than 0.000001 is pointless: either the results are identical, or else the difference is much larger than that.

Contrariwise, if your numbers were around 1e-200, then the smallest difference between them that could be represented would be around 1e-215. A difference of 0.000001 is (again) utterly ridiculous to contemplate: it could only happen if one of the two calculations was incorrect by ~196 orders of magnitude ¹.
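To see the spacing for yourself, `std::nextafter` gives the neighbouring representable value. This is a small sketch with a made-up helper name, `gap_above`, assuming IEEE 754 doubles:

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it.
// Assumes x is finite and positive.
double gap_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

With IEEE 754 doubles, `gap_above(1e200)` is on the order of 1e184 and `gap_above(1e-200)` is on the order of 1e-216, the same ballpark as the rough figures above, and in both cases nowhere near 0.000001.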

So, to do this, you pick the number of significant decimal places that need to match for you to consider the two equal, and scale that by the magnitude of the numbers to get a maximum delta. For example, if you decide they need to agree to 7 decimal places, and the numbers are in the range of 1eN, then the maximum difference is 1e(N-7). If N is 100, then the maximum delta is 1e93. If N is -150, then the maximum delta is 1e-157.
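One way to sketch that scaling is shown below. The name `roughly_equal` and the choice to scale by the larger magnitude of the two operands are assumptions made for illustration, not the only reasonable design:

```cpp
#include <algorithm>
#include <cmath>

// Approximate equality with a tolerance scaled to the operands.
// digits is the number of leading decimal digits that must agree.
// Assumes finite inputs; two zeros compare equal.
bool roughly_equal(double a, double b, int digits) {
    double scale = std::max(std::fabs(a), std::fabs(b));  // magnitude, roughly 1eN
    double max_delta = scale * std::pow(10.0, -digits);   // roughly 1e(N - digits)
    return std::fabs(a - b) <= max_delta;
}
```

For values near 1e100 and `digits` set to 7, the allowed delta comes out near 1e93, so `roughly_equal(1e100, 1e100 + 1e92, 7)` is true and `roughly_equal(1e100, 1e100 + 1e94, 7)` is false.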

This still needs to be used with considerable care, especially when dealing with groups of numbers. The problem is that an approximate equality like this is no longer transitive. Even if the numbers are in a range where an epsilon of 0.000001 might make some sense, it can say that A == B, and B == C, but A != C. This can lead to quite surprising results, to put it mildly (in some cases, like sorting, it can lead to complete failure, because you've violated the requirement for a strict weak ordering).
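The loss of transitivity can be reproduced with the `compare` function from the question. The wrapper `breaks_transitivity` is a hypothetical helper added only for this demonstration:

```cpp
// The fixed-epsilon comparison from the question, reproduced unchanged.
int compare(double a, double b) {
    double z = a - b;
    if (z < -0.000001) return -1;
    if (z > 0.000001) return 1;
    return 0;
}

// True when a "equals" b and b "equals" c, yet a does not "equal" c.
bool breaks_transitivity(double a, double b, double c) {
    return compare(a, b) == 0 && compare(b, c) == 0 && compare(a, c) != 0;
}
```

For example, `breaks_transitivity(0.0, 0.0000008, 0.0000016)` is true: each neighbouring pair differs by less than the epsilon, but the two ends differ by more.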

As far as your original problem of finding the maximum value in a vector goes, this basically means that just using < will find the value that's actually the largest. Depending on roundoff error, however, there may be other values that are smaller, but should theoretically be larger. Depending on how those results were calculated and what you're trying to accomplish, you might want to consider finding not just the single value that's the largest, but all the other values that are within a chosen maximum error of the largest. The others would be values that aren't quite as large, but are close enough that they could represent the largest measurement (or whatever exactly you're working with).
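A rough sketch of that idea: find the true maximum with a plain comparison first, then collect everything close enough to it. The function name `near_largest` and the use of a relative tolerance are assumptions for illustration, and the right tolerance depends entirely on how the values were computed:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Returns every element within rel_tol (relative to the maximum's magnitude)
// of the maximum, in their original order.
// Assumes v is non-empty and contains no NaNs.
std::vector<double> near_largest(const std::vector<double>& v, double rel_tol) {
    const double largest = *std::max_element(v.begin(), v.end());
    const double max_delta = std::fabs(largest) * rel_tol;
    std::vector<double> result;
    for (double x : v) {
        if (largest - x <= max_delta) result.push_back(x);
    }
    return result;
}
```

With the input {1.0, 5.0, 4.9999999, 3.0, 5.0} and a relative tolerance of 1e-6, this returns the two 5.0 entries plus 4.9999999; with a tolerance of 0 it returns only the exact maxima.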

There's one other point that's probably worth mentioning: some floating point formats include representations that are "not a number" (NaN). A NaN produces false for every ordered or equality comparison; only != yields true (in fact, a common way of detecting a NaN is if (x != x) /* it's a NaN */ ). As such, if your input values might include a NaN, seemingly identical comparisons could give entirely different results. For example, if x < y and if not x >= y should normally be the same, but if either x or y is a NaN, they won't be.
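A small sketch of the NaN behaviour, using hypothetical helper names and assuming IEEE 754 semantics (in particular, that the code is not built with options such as -ffast-math that let the compiler assume NaNs never occur). In real code, `std::isnan` from `<cmath>` is the standard way to test for a NaN:

```cpp
#include <limits>

// Two tests that agree for ordinary numbers but not when a NaN is involved.
bool less_than(double x, double y) { return x < y; }
bool not_greater_equal(double x, double y) { return !(x >= y); }

// The self-comparison NaN check mentioned above.
bool is_nan(double x) { return x != x; }

// Convenience helper that produces a quiet NaN for experiments.
double make_nan() { return std::numeric_limits<double>::quiet_NaN(); }
```

With `double n = make_nan();`, `less_than(n, 1.0)` is false but `not_greater_equal(n, 1.0)` is true, even though the two agree for any pair of ordinary numbers.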

¹ To try to put 196 orders of magnitude in perspective, let's assume you were doing calculations attempting to compare the size of a neutron with the size of a proton. Then you decide to check whether the difference you got between those two sizes was greater than the diameter of the Milky Way galaxy.
Oh, but that wouldn't be 196 orders of magnitude. That's only about 36 orders of magnitude. So let's check if the difference we got was larger than the (currently believed) diameter of the universe. That gets us up to around 50 orders of magnitude.
I guess I've kind of failed at putting it into perspective, though: even if we look at the smallest size that string theory attributes to a "string" and compare that to the (generally accepted) size of the universe, the two aren't even close to 196 orders of magnitude apart. Worse, I doubt the size of a string or the size of the known universe is something anybody can really visualize meaningfully anyway. The one is too small and the other too large for anybody to really grasp, and the difference between the two is still drastically short of the difference we're talking about.

Answer 2

Your colleague is wrong; in this case the result of operator < is enough. Comparing with an epsilon could be necessary if you need to find out how many maximum elements there are in the container, etc. Just to find a maximum element, a simple comparison is enough.

Answer 1
3 2016-10-27 15:39:31

Answer 2
1 2016-10-27 15:02:58
