
How can the worst case for an algorithm have different bounds?

I've been trying to figure this out all day. Some other threads address this, but I really don't understand the answers. There are also many answers that contradict one another.

I understand that an algorithm will never take longer than its upper bound and never be faster than its lower bound. However, I didn't know an upper bound existed for best-case time and a lower bound existed for worst-case time. This question really threw me for a loop. I can't wrap my head around this... a given run time can have different upper and lower bounds?

For example, if someone asked: "Show that the worst-case running time of some algorithm on a heap of size n is Big Omega(lg(n))". How do you possibly get a lower bound, or any bound for that matter, when you are given a specific run time?

So, in sum, an algorithm's worst-case upper bound can be different from its worst-case lower bound? How can this be? Once the case is fixed, don't bounds become irrelevant? I'm trying to study algorithms independently, and I really need to wrap my head around this first.

The meat of my accepted answer to that question is a function whose running time oscillates between n^2 and n^3 depending on whether n is odd. The point that I was trying to make is that sometimes bounds of the form O(n^k) and Omega(n^k) aren't sufficiently descriptive, even though the worst case running time is a perfectly well defined function (which, like all functions, is its own best lower and upper bound). This happens with more natural functions like n log n, which is Omega(n^k) but not O(n^k) for k ≤ 1, and O(n^k) but not Omega(n^k) for k > 1 (and hence not Theta(n^k) regardless of how we choose a constant k).
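As a concrete sketch of that idea (my own illustration, not the code from the linked answer), here is a hypothetical function that does cubic work for odd n and quadratic work for even n; its worst-case running time is Omega(n^2) and O(n^3), but not Theta(n^k) for any single k:

    def oscillating(n):
        # Hypothetical workload: Theta(n^3) steps when n is odd,
        # Theta(n^2) steps when n is even, so no single Theta(n^k) fits.
        steps = 0
        limit = n ** 3 if n % 2 == 1 else n ** 2
        for _ in range(limit):
            steps += 1  # one unit of "work"
        return steps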

Big O notation describes efficiency in runtime iterations, generally based on the size of an input data set. The notation is written in its simplest form, ignoring constant multiples and additive terms but keeping the dominant term; for example, 3N^2 + 5N simplifies to O(N^2). If you have an operation of O(1), it is executed in constant time, no matter the input data.

However, if you have something such as O(N) or O(log(N)), it will execute at different rates depending on the input data.

The upper and lower bounds describe the largest and smallest number of iterations, respectively, that an algorithm can take.

Example: for O(N), the upper bound corresponds to the largest input data and the lower bound to the smallest.
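To make those growth rates concrete, here is a short Python sketch (my own illustration, not from the original answer) that counts the steps a constant-time, logarithmic, and linear operation would take on inputs of increasing size:

    def constant_steps(data):
        # O(1): one step no matter how large the input is.
        return 1

    def log_steps(data):
        # O(log N): halve the range until it is empty, as binary search does.
        steps, n = 0, len(data)
        while n > 1:
            n //= 2
            steps += 1
        return steps

    def linear_steps(data):
        # O(N): touch every element once, as a linear scan does.
        return sum(1 for _ in data)

    for size in (16, 1024, 65536):
        data = list(range(size))
        print(size, constant_steps(data), log_steps(data), linear_steps(data))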

Extra sources: Big O Cheat Sheet and MIT Lecture Notes

UPDATE: Looking at the Stack Overflow question mentioned above, that algorithm is broken into three parts, so it has 3 possible types of runtime depending on the data. Really, this amounts to three different algorithms designed to handle different data values. An algorithm is generally classified with just one efficiency notation: the tightest bound that holds for ALL possible values of N.

In the case of O(N^2), larger data will take quadratically longer, while a smaller input will proceed quickly. The algorithm determines how quickly a data set will be processed, yet the bounds are given with respect to the range of data the algorithm is designed to handle.
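As an illustration of that quadratic growth (my own sketch, not part of the original answer), comparing every pair of elements takes about N^2/2 steps, so doubling the input roughly quadruples the work:

    def count_pair_comparisons(data):
        # O(N^2): compare every element against every later element.
        steps = 0
        items = list(data)
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                steps += 1
        return steps

    print(count_pair_comparisons(range(100)))  # 4950
    print(count_pair_comparisons(range(200)))  # 19900, about 4x the work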

Suppose you write a program like this to find the smallest prime factor of an integer:

def lpf(n):
    # Return the smallest prime factor of n (n itself when n is prime); assumes n >= 2.
    for i in range(2, n + 1):
        if n % i == 0:
            return i

If you run the function on the number 10^11 + 3, it will take 10^11 + 2 steps. If you run it on the number 10^11 + 4, it will take just one step. So the function's best-case time is O(1) steps and its worst-case time is O(n) steps.
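A quick check with small numbers (my own example) shows the same best-case/worst-case gap:

    print(lpf(10))  # 2  -- even input: the loop returns on its first iteration
    print(lpf(97))  # 97 -- prime input: the loop runs all the way up to n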

I will try to explain it with the quicksort algorithm. In quicksort you have an array and choose an element as the pivot. The next step is to partition the input array into two arrays: the first one contains elements < pivot and the second one elements > pivot. Now assume you apply quicksort to an already sorted list, and the pivot element is always the last element of the array. The result of each partition will be an array of size n-1 and an array of size 1 (the pivot element). This results in a runtime of O(n*n). Now assume instead that the pivot element always splits the array into two equal-sized arrays. In every step the array size is cut in half, which results in O(n log n). I hope this example makes it a bit clearer.

Another well-known sorting algorithm is mergesort, which always has a runtime of O(n log n). In mergesort you cut the array down until only one element is left, then climb up the call stack merging the one-element arrays, after that the arrays of size two, and so on.
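Here is a minimal Python sketch of that partitioning scheme (my own illustration, using the last element as the pivot as described above); on an already sorted array every partition is maximally lopsided, giving the O(n^2) case:

    def quicksort(arr):
        # Last-element pivot. On an already sorted array every partition
        # yields sizes (n-1, 0), so the recursion depth is n and the total
        # work is O(n^2); an even split every time would give O(n log n).
        if len(arr) <= 1:
            return arr
        pivot = arr[-1]
        smaller = [x for x in arr[:-1] if x < pivot]
        larger = [x for x in arr[:-1] if x >= pivot]
        return quicksort(smaller) + [pivot] + quicksort(larger)

    print(quicksort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]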

Let's say you implement a set using an array. To insert an element you simply put it in the next available bucket. If there is no available bucket, you increase the capacity of the array by a value m.

For the insert algorithm, "there is not enough space" is the worst case.

 insert(S, e)
   if size(S) >= capacity(S)        # no free bucket left: the worst case
     reserve(S, size(S) + m)        # grow the backing array by m
   put(S, e)                        # write e into the next free bucket

Assume we never delete elements. By keeping track of the last available position, put, size and capacity are Θ(1) in time and space.

What about reserve? If it is implemented like realloc in C, in the best case you just allocate new memory at the end of the existing memory (the best case for reserve), or you have to move all existing elements as well (the worst case for reserve).

  • The worst-case lower bound for insert is the best case of reserve(), which is linear in m if we don't nitpick. insert in the worst case is Ω(m) in space and time.
  • The worst-case upper bound for insert is the worst case of reserve(), which is linear in m+n. insert in the worst case is O(m+n) in space and time (a runnable sketch follows below).
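Putting the pieces together, here is a minimal Python sketch of this array-backed set (my own rendering of the pseudocode above; the fixed growth increment m and the copying reserve are assumptions for demonstration):

    class ArraySet:
        def __init__(self, m=4):
            self.m = m                      # fixed growth increment
            self.buckets = [None] * m
            self.size = 0                   # index of the next free bucket

        def reserve(self, new_capacity):
            # Worst case for insert: copy all n existing elements into a
            # larger array, O(m+n). (A real realloc may extend the block
            # in place instead, which is the Omega(m) best case.)
            bigger = [None] * new_capacity
            bigger[:self.size] = self.buckets[:self.size]
            self.buckets = bigger

        def insert(self, e):
            if self.size >= len(self.buckets):   # no free bucket left
                self.reserve(self.size + self.m)
            self.buckets[self.size] = e          # put: O(1)
            self.size += 1

    s = ArraySet(m=2)
    for x in (1, 2, 3):
        s.insert(x)                 # the third insert triggers reserve
    print(s.buckets[:s.size])       # [1, 2, 3]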
