Are asymptotic notations flawed?

The best-case complexity of an algorithm is the minimum amount of time the algorithm takes to accomplish its task. We know that the best-case complexity of algorithms like merge sort, quicksort, etc. is Ω(n log(n)), which defines the lower bound of these algorithms.

We also know that, in asymptotic notation:

O(n) + O(n log(n)) = O(n log(n))

and also:

Ω(n) + Ω(n log(n)) = Ω(n log(n))

So if, in these sorting algorithms, we first traverse the entire array in O(n) time to determine whether it is already sorted in ascending or descending order, then asymptotically their average-case and worst-case complexities remain the same, but their best-case complexity now becomes Ω(n).

Logically, there must be a flaw in my understanding of these asymptotic notations; otherwise someone would surely have pointed this out when asymptotic notation was being developed or becoming popular for measuring sorting algorithms. Am I correct in assuming that this is a plausible flaw in asymptotic notation, or am I missing some rule?

There are certainly problems with using asymptotic complexity as a measure of speed. First and foremost, constants obviously do matter. 1000n is larger than n log n for every input size that occurs in practice, and n^1000 is larger than 2^n until n reaches roughly 14,000, by which point neither running time is feasible. As it turns out, however, asymptotic complexity is often a fairly good indicator of an algorithm's actual speed.
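To make the point about constants concrete, here is a small sketch that compares the cost functions directly. The function names are made up for illustration, and log is assumed to be base 2; exact integer arithmetic is used so that no floating-point overflow gets in the way.

```python
import math


def linear_with_big_constant(n: int) -> int:
    """Cost model 1000n (hypothetical algorithm with a large constant)."""
    return 1000 * n


def linearithmic(n: int) -> float:
    """Cost model n log2(n)."""
    return n * math.log2(n)


def first_n_where_exponential_wins(power: int = 1000) -> int:
    """Smallest n >= 2 with 2**n > n**power, found by a linear scan.

    Python integers are arbitrary precision, so the comparison is exact.
    """
    n = 2
    while 2 ** n <= n ** power:
        n += 1
    return n


if __name__ == "__main__":
    n = 10 ** 6
    print(linear_with_big_constant(n) > linearithmic(n))  # 1000n vs n log n at n = 1,000,000
    print(first_n_where_exponential_wins())                # where 2^n finally overtakes n^1000
```

The asymptotically "worse" function only takes over at input sizes where the comparison no longer matters in practice.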

The issue you raise is real, but I wouldn't consider it a problem. It is true that a simple isSorted() check at the start of quicksort reduces its best-case complexity to Θ(n), but people very rarely care about best-case performance. Indeed, many algorithms for common problems can be modified to be linear in the best case, but this is just not very useful.
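A minimal sketch of that idea, assuming a plain list-based quicksort and ascending order only. The names is_sorted, quicksort, and sort_with_precheck are illustrative, not from any library.

```python
from typing import List


def is_sorted(values: List[int]) -> bool:
    """Return True if values is in non-decreasing order (one linear pass)."""
    return all(values[i] <= values[i + 1] for i in range(len(values) - 1))


def quicksort(values: List[int]) -> List[int]:
    """Plain, unoptimized quicksort that returns a new list."""
    if len(values) < 2:
        return list(values)
    pivot = values[len(values) // 2]
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def sort_with_precheck(values: List[int]) -> List[int]:
    """Guess that the input is already sorted; fall back to quicksort if not."""
    if is_sorted(values):        # O(n) check: this becomes the new best case
        return list(values)
    return quicksort(values)     # average and worst case are asymptotically unchanged
```

The pre-check adds at most one extra linear pass to every call, which is absorbed by the n log n (or n^2) term in the other cases.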

Finally, note that this is not really a flaw in asymptotic notation specifically. Making a random guess and verifying whether the guess was correct (such as guessing that an array is already sorted) often really does improve best-case performance while having very little effect on the average or worst case, regardless of the notation used.

First, you should distinguish in your mind between the case (best, worst, average, etc.) and the bound (upper, lower, O, Omega, Theta, etc.).

Let us focus on Bubble Sort, defined as follows:

if array == null or array.length < 2 then return
do
    swapped = false
    for i = 0 to array.length - 2
        if array[i] > array[i+1] then
            swap(array, i, i+1)
            swapped = true
until not swapped

The best case for this algorithm is an already-sorted array, in which case the lower (Omega), upper (O), and Theta bounds all agree: the runtime is bounded by a function of the form f(n) = an; that is, T(n) = Θ(n). The best case for Bubble Sort is linear.

The worst case for this algorithm is an array in reverse-sorted order. In this case, the runtime is bounded from above and below by a function like g(n) = bn^2; that is, T(n) = Θ(n^2) in the worst case.
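To see both cases in numbers, here is a rough Python transcription of the pseudocode above with a comparison counter added. The function name and the counter are my additions for illustration; the loop structure follows the pseudocode.

```python
from typing import Iterable, List, Tuple


def bubble_sort_comparisons(values: Iterable[int]) -> Tuple[List[int], int]:
    """Bubble-sort a copy of values; return (sorted_list, number_of_comparisons)."""
    array = list(values)
    comparisons = 0
    if len(array) < 2:
        return array, comparisons
    swapped = True                       # emulates do ... until not swapped
    while swapped:
        swapped = False
        for i in range(len(array) - 1):  # i = 0 to array.length - 2
            comparisons += 1
            if array[i] > array[i + 1]:
                array[i], array[i + 1] = array[i + 1], array[i]
                swapped = True
    return array, comparisons


if __name__ == "__main__":
    n = 10
    _, best = bubble_sort_comparisons(range(n))              # already sorted
    _, worst = bubble_sort_comparisons(range(n - 1, -1, -1)) # reverse sorted
    print(best, worst)  # n - 1 comparisons vs n * (n - 1) comparisons
```

A sorted input needs a single pass of n - 1 comparisons, while a reverse-sorted input needs n passes of n - 1 comparisons each, which is where the linear and quadratic bounds come from.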

You aren't missing anything; it's perfectly ordinary for algorithms to have different worst-case and best-case runtime bounds. It's also perfectly possible that an algorithm does not optimize for the best case, since the best case is typically not the one we are worried about anyway. Yes, merge sort could first check whether the array is sorted, but sorted arrays make up a relatively small share of all possible arrays of length N.

Also, you may choose to talk about a lower bound on the worst case, or an upper bound on the best case. These are not what we typically focus on (we usually look at an upper bound on the worst case, or possibly a lower bound on the best case), but the case and the bound are totally separate things and can be combined arbitrarily.
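Written out for the Bubble Sort above, with T_best and T_worst as my own shorthand for the best-case and worst-case running times on an array of length n, every case can be paired with every bound:

```latex
% Each case (best, worst) is a function of n; each function can carry
% an upper bound (O), a lower bound (Omega), or both (Theta).
\begin{align*}
T_{\text{best}}(n)  &= \Theta(n)   && \text{hence } T_{\text{best}}(n) = O(n) \text{ and } T_{\text{best}}(n) = \Omega(n) \\
T_{\text{worst}}(n) &= \Theta(n^2) && \text{hence } T_{\text{worst}}(n) = O(n^2) \text{ and } T_{\text{worst}}(n) = \Omega(n^2)
\end{align*}
```

So "lower bound on the worst case" here is the statement T_worst(n) = Ω(n^2), and "upper bound on the best case" is T_best(n) = O(n).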
