
Most effective algorithm to find the maximum of double-precision values

What is the most effective way of finding the maximum value in a set of variables?

I have seen solutions such as

private double findMax(double... vals) {
    double max = Double.NEGATIVE_INFINITY;

    for (double d : vals) {
        if (d > max) max = d;
    }
    return max;
}

But what would be the most effective algorithm for doing this?

You can't reduce the complexity below O(n) if the list is unsorted, but you can improve the constant factor by a lot. Use SIMD. For example, with SSE2 you would use the MAXPD instruction, which performs two double-precision compare+select operations per instruction (MAXPS does four for single-precision floats; MAXSS is the scalar form and handles only one value). Unroll the loop a bit to reduce the cost of loop control logic. Then, outside the loop, find the max of the values left in your SSE register.
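The unroll-then-reduce structure described above can be sketched in plain Java, the language of the question. This is only an illustration under stated assumptions: the class name `LaneMax` and the lane count of four are hypothetical choices, and whether a given JIT compiler turns the loop into real SIMD instructions is not guaranteed.

```java
// Hypothetical sketch: four independent running maxima stand in for the
// lanes of a vector register. Not a guaranteed SIMD implementation.
final class LaneMax {
    static double findMax(double... vals) {
        double m0 = Double.NEGATIVE_INFINITY;
        double m1 = Double.NEGATIVE_INFINITY;
        double m2 = Double.NEGATIVE_INFINITY;
        double m3 = Double.NEGATIVE_INFINITY;

        // Main loop, unrolled by four: each "lane" only depends on itself,
        // so the four updates per iteration are independent of each other.
        int limit = vals.length - (vals.length % 4);
        int i = 0;
        for (; i < limit; i += 4) {
            m0 = vals[i]     > m0 ? vals[i]     : m0;
            m1 = vals[i + 1] > m1 ? vals[i + 1] : m1;
            m2 = vals[i + 2] > m2 ? vals[i + 2] : m2;
            m3 = vals[i + 3] > m3 ? vals[i + 3] : m3;
        }

        // Outside the loop: reduce the four lane maxima to a single value.
        double lo = m0 > m1 ? m0 : m1;
        double hi = m2 > m3 ? m2 : m3;
        double max = lo > hi ? lo : hi;

        // Tail: at most three leftover elements.
        for (; i < vals.length; i++) {
            max = vals[i] > max ? vals[i] : max;
        }
        return max;
    }
}
```

Calling `LaneMax.findMax(3.0, 9.0, -1.0, 4.0, 8.0)` processes the first four values in the unrolled loop and the fifth in the tail loop. Like the loop in the question, it returns `Double.NEGATIVE_INFINITY` for an empty input.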

This gives a benefit for a list of any size. Multithreading also makes sense for really large lists.
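For the multithreading suggestion, one low-effort option in Java is a parallel stream. The sketch below is an assumption about how one might do it, not something from the answer; `ParallelMax` is a hypothetical name, and for small arrays the overhead of splitting the work will likely outweigh any gain.

```java
import java.util.Arrays;

// Hypothetical sketch: let the common fork/join pool split the array
// into chunks and combine the per-chunk maxima.
final class ParallelMax {
    static double findMax(double... vals) {
        // DoubleStream.max() returns an empty OptionalDouble for an empty
        // stream; fall back to the same sentinel the sequential loop uses.
        // Note: unlike the `d > max` loop, max() yields NaN if any element is NaN.
        return Arrays.stream(vals)
                     .parallel()
                     .max()
                     .orElse(Double.NEGATIVE_INFINITY);
    }
}
```

The result is the same as the sequential loop for inputs without NaN; only the work distribution differs.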

Assuming the list does not have its elements in any particular order, the algorithm you mentioned in your question is optimal. It must look at every element once, so it takes time directly proportional to the size of the list, O(n).

There is no algorithm for finding the maximum whose worst-case running time has an upper bound lower than O(n).

Proof: Suppose, for a contradiction, that there is an algorithm that finds the maximum of a list in less than O(n) time. Then there must be at least one element that it does not examine. If the algorithm selects this element as the maximum, an adversary may choose a value for the element such that it is smaller than one of the examined elements. If the algorithm selects any other element as the maximum, an adversary may choose a value for the unexamined element such that it is larger than all the other elements. In either case, the algorithm fails to find the maximum.

EDIT: This was my attempt at an answer, but please look at the comments, where @BenVoigt proposes a better way to optimize the expression.


  • You need to traverse the whole list at least once,
  • so it is a matter of finding a more efficient expression for if (d>max) max=d, if there is one.

Assuming we need the general case where the list is unsorted (if we kept it sorted we'd just pick the last item, as @IgnacioVazquez points out in the comments), and after researching branch prediction a little (Why is it faster to process a sorted array than an unsorted array?, see the 4th answer), it looks like

 if (d>max) max=d;

can be more efficiently rewritten as

 max=d>max?d:max; 

The reason is that the first statement is normally translated into a branch (this is entirely compiler and language dependent, but it happens at least in C and C++, and even in a VM-based language like Java), while the second one is translated into a conditional move.

Modern processors pay a big penalty when a branch prediction goes wrong (the execution pipeline has to be flushed), while a conditional move is a single instruction with no branch to mispredict, so it does not disturb the pipeline.
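Applied to the whole method from the question, the rewrite might look like the following sketch. `TernaryMax` is a hypothetical name, and whether the JIT actually emits a conditional move for the ternary is an assumption that would need to be checked on the target JVM.

```java
// Hypothetical sketch of the question's loop using the ternary form.
final class TernaryMax {
    static double findMax(double... vals) {
        double max = Double.NEGATIVE_INFINITY;
        for (double d : vals) {
            // Same result as `if (d > max) max = d;`, written as a select.
            // A comparison with NaN is false, so NaN elements are skipped.
            max = d > max ? d : max;
        }
        return max;
    }
}
```

`Math.max(d, max)` is a tempting alternative, but it is not a drop-in replacement: it returns NaN if either argument is NaN, whereas the comparison above ignores NaN elements.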

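If the elements arrive in an unpredictable order, the outcome of the comparison is hard to predict and many branch predictions will go wrong. How often that happens depends on the data: in a uniformly shuffled list a new maximum becomes rarer as the scan proceeds (the i-th element is a new maximum with probability 1/i), so the branch is mispredicted most on inputs that keep producing new maxima irregularly.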
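How often the update branch actually fires on a given input can be counted directly, which is a cheap sanity check before optimizing. The sketch below is illustrative only; `UpdateCounter`, `countUpdates` and `randomValues` are hypothetical names, and the uniformly random test data is an assumption about the input.

```java
import java.util.Random;

// Hypothetical helper: counts how many times the running maximum changes.
final class UpdateCounter {
    static int countUpdates(double[] vals) {
        double max = Double.NEGATIVE_INFINITY;
        int updates = 0;
        for (double d : vals) {
            if (d > max) {   // the branch under discussion
                max = d;
                updates++;   // count every time the branch is taken
            }
        }
        return updates;
    }

    // Uniform random values in [0, 1), reproducible via the seed.
    static double[] randomValues(int n, long seed) {
        Random rng = new Random(seed);
        double[] vals = new double[n];
        for (int i = 0; i < n; i++) {
            vals[i] = rng.nextDouble();
        }
        return vals;
    }
}
```

An ascending array takes the branch on every element and a descending one only on the first. For distinct values in uniformly random order, the expected count is the harmonic number H_n, roughly ln n + 0.58.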

Please refer to the linked question for a nice discussion of all this, together with benchmarks.

