
How do I find the recurrence relation of a merge sort implementation and apply the Master Theorem to it?

I'm trying to apply the Master Theorem to this merge sort code, but first I need to find its recurrence relation, and I'm struggling both to do that and to understand it. I already saw some similar questions here but couldn't understand the explanations. For example, do I first need to find how many operations the code performs? Could someone help me with that?


def mergeSort(alist):
    print("Splitting ",alist)
    if len(alist)>1:
        mid = len(alist)//2
        lefthalf = alist[:mid]
        righthalf = alist[mid:]

        mergeSort(lefthalf)
        mergeSort(righthalf)

        i=0
        j=0
        k=0
        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                alist[k]=lefthalf[i]
                i=i+1
            else:
                alist[k]=righthalf[j]
                j=j+1
            k=k+1

        while i < len(lefthalf):
            alist[k]=lefthalf[i]
            i=i+1
            k=k+1

        while j < len(righthalf):
            alist[k]=righthalf[j]
            j=j+1
            k=k+1
    print("Merging ",alist)

alist = [54,26,93,17,77,31,44,55,20]
mergeSort(alist)
print(alist)

OK, let's start with the analysis. Assume your list has n elements and that n is a power of 2 (this avoids, without loss of generality, the complication of sublists of sizes (n+1)/2 and (n-1)/2). I'm skipping the prints and some constant-time commands (+c').

    if len(alist)>1:
        mid = len(alist)//2

Counting the elements can be done in linear time by going through all of them (in CPython, len is actually O(1), but a linear bound is still a valid upper bound). A division by 2 does not change the overall behaviour:
T(n) = (a_1)*n + (a_2)*n + ... + c'

        lefthalf = alist[:mid]
        righthalf = alist[mid:]

Splitting a list can be interpreted as copying it, so it has linear complexity:
T(n) = (a_1)*n + (a_2)*n + (a_3)*(n/2)*2 + ... + c'

        mergeSort(lefthalf)
        mergeSort(righthalf)

Here is the tricky part. You know nothing about the time complexity of the function mergeSort yet, but you do know the input size of its calls, n/2, resulting in
T(n) = (a_1)*n + (a_2)*n + (a_3)*n + T(n/2) + T(n/2) + ... + c'

        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                alist[k]=lefthalf[i]
                i=i+1
            else:
                alist[k]=righthalf[j]
                j=j+1
            k=k+1

Here a loop starts, meaning every command inside it can be executed as many times as the loop repeats. Thankfully, the body contains only constant-time commands. The only difficult part is determining the number of iterations, which can only be (len(lefthalf) - 1) + (len(righthalf) - 1), i.e. at most n - 2 (and at least n/2 - 1). The while condition is checked once more, resulting in the time-complexity upper bound (a_4)*(n-2) + c_4 = (a_4)*n - 2*a_4 + c_4 = (a_4)*n + c_4'

Something similar happens with the other two loops (the commands inside are constant-time, at most n/2 iterations each, constant overhead once the while condition is false), giving the upper bounds:
(a_5)*n/2 + c_5 = (a_5')*n + c_5
(a_6)*n/2 + c_6 = (a_6')*n + c_6

Summa summarum:

T(n) = (a_1)*n + (a_2)*n + (a_3)*n + T(n/2) + T(n/2) + (a_4)*n + c_4' + (a_5')*n + c_5 + (a_6')*n + c_6 + c'
= (a_1+a_2+a_3+a_4+a_5'+a_6')*n + 2T(n/2) + c_4'+c_5+c_6+c'
= 2T(n/2) + a'*n + c

This is the formula for the upper bound (Big O). Calculating the lower bound (Big Omega) ends in the same structure; only the actual (but uninteresting) values of a' and c will vary.
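As a sanity check, you can unroll this recurrence numerically. The sketch below uses placeholder constants a' = 1 and c = 1 (chosen arbitrarily, not derived from the code) and watches T(n)/(n*log2(n)):

```python
def T(n, a_prime=1, c=1):
    """Unroll T(n) = 2*T(n/2) + a'*n + c numerically.

    a_prime and c are arbitrary placeholder constants; the base case
    assumes constant work c on a trivial list.
    """
    if n <= 1:
        return c
    return 2 * T(n // 2, a_prime, c) + a_prime * n + c

# Ratio of T(n) to n*log2(n) for n = 2^k, k = 5..15.
ratios = [T(2**k) / (2**k * k) for k in range(5, 16)]
# The ratios flatten out toward a constant as n grows, which is
# exactly the behaviour of a Theta(n log n) function.
print(ratios[0], ratios[-1])
```

The shrinking, converging ratio is what distinguishes n log n growth from, say, plain linear or quadratic growth, where the same ratio would go to 0 or diverge.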

So now you have a formula to work with. The Master Theorem can be applied to recurrences of the form T(n) = a*T(n/b) + f(n). You can clearly see that a=2, b=2 and f(n) = a'*n + c. Now the complexity of T(n) falls into one of 3 cases, depending on the complexity of f(n) and the relationship of a and b. First you need to calculate the critical constant c_crit = log_b a = log_2 2 = 1. Then you need the complexity of f(n) = a'*n + c = O(n) = O(n^1) (linear complexity).

  • First case, f(n) = O(n^c) with c smaller than c_crit: does not apply (1 is not smaller than 1)
  • Third case, f(n) = Omega(n^c) with c larger than c_crit: does not apply (when Big O is the upper bound, Big Omega is the lower bound)
  • Second case, f(n) = Theta(n^c_crit * log^k n) for some k >= 0: here c_crit = 1 and k = 0. This condition holds, so from the Master Theorem you can now conclude

T(n) = Theta(n^c_crit log^(k+1) n) = Theta(n^1 log^(0+1) n) = Theta(n log(n))

So this algorithm has complexity n log(n). Big O, Big Omega and Big Theta of f(n) coincide in this example, but they may differ for others.
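If you want to see this result empirically, here is the mergeSort from the question rewritten to return its element-comparison count (a counting sketch for illustration, not the original code):

```python
import math
import random

def merge_sort_count(alist):
    """Sort alist in place and return the number of element comparisons."""
    if len(alist) <= 1:
        return 0
    mid = len(alist) // 2
    lefthalf, righthalf = alist[:mid], alist[mid:]
    comps = merge_sort_count(lefthalf) + merge_sort_count(righthalf)
    i = j = k = 0
    while i < len(lefthalf) and j < len(righthalf):
        comps += 1  # one lefthalf[i] < righthalf[j] comparison
        if lefthalf[i] < righthalf[j]:
            alist[k] = lefthalf[i]
            i += 1
        else:
            alist[k] = righthalf[j]
            j += 1
        k += 1
    while i < len(lefthalf):
        alist[k] = lefthalf[i]
        i += 1
        k += 1
    while j < len(righthalf):
        alist[k] = righthalf[j]
        j += 1
        k += 1
    return comps

random.seed(0)
data = [random.random() for _ in range(1 << 12)]  # n = 4096, a power of 2
comps = merge_sort_count(data)
# The list ends up sorted and the comparison count stays below
# n * log2(n) = 4096 * 12 = 49152.
assert data == sorted(data)
assert comps < len(data) * math.log2(len(data))
```

Counting only comparisons ignores the constant factors a' and c from the derivation, but the growth rate is the same, so the count tracks n log2 n as n doubles.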

To determine the run-time of a divide-and-conquer algorithm using the Master Theorem, you need to express the algorithm's run-time as a recursive function of the input size, in the form:

T(n) = aT(n/b) + f(n)

T(n) is how we express the total runtime of the algorithm on an input of size n.

a stands for the number of recursive calls the algorithm makes.

T(n/b) represents the recursive calls: the n/b signifies that the input size to the recursive calls is some particular fraction of the original input size (the divide part of divide-and-conquer).

f(n) represents the amount of work you need to do in the main body of the algorithm, generally just to combine solutions from recursive calls into an overall solution (you could say this is the conquer part).

Here's a slightly refactored definition of mergeSort:

def mergeSort(arr):
  if len(arr) <= 1: return # array size 1 or 0 is already sorted
  
  # split the array in half
  mid = len(arr)//2
  L = arr[:mid]
  R = arr[mid:]

  mergeSort(L) # sort left half
  mergeSort(R) # sort right half
  merge(L, R, arr) # merge sorted halves

We need to determine a, n/b and f(n).

Because each call of mergeSort makes two recursive calls, mergeSort(L) and mergeSort(R), a=2:

T(n) = 2T(n/b) + f(n)

n/b represents the fraction of the current input that recursive calls are made with. Because we are finding the midpoint and splitting the input in half, passing one half of the current array to each recursive call, n/b = n/2 and b=2. (If each recursive call instead got 1/4 of the original array, b would be 4.)

T(n) = 2T(n/2) + f(n)

f(n) represents all the work the algorithm does besides making recursive calls. Every time we call mergeSort, we calculate the midpoint in O(1) time. We also split the array into L and R, and technically creating these two sub-array copies is O(n). Then, presuming mergeSort(L) sorted the left half of the array and mergeSort(R) sorted the right half, we still have to merge the sorted sub-arrays together to sort the entire array with the merge function. Together, this makes f(n) = O(1) + O(n) + complexity of merge. Now let's take a look at merge:

def merge(L, R, arr):
  i = j = k = 0    # 3 assignments
  while i < len(L) and j < len(R): # 2 comparisons
    if L[i] < R[j]: # 1 comparison, 2 array idx
      arr[k] = L[i] # 1 assignment, 2 array idx
      i += 1        # 1 assignment
    else:
      arr[k] = R[j] # 1 assignment, 2 array idx
      j += 1        # 1 assignment
    k += 1          # 1 assignment

  while i < len(L): # 1 comparison
    arr[k] = L[i]   # 1 assignment, 2 array idx
    i += 1          # 1 assignment
    k += 1          # 1 assignment

  while j < len(R): # 1 comparison
    arr[k] = R[j]   # 1 assignment, 2 array idx
    j += 1          # 1 assignment
    k += 1          # 1 assignment
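To make the per-line tallies concrete, here is the same merge with a rough operation counter bolted on (an illustrative sketch; each increment bundles the comparisons, array indexes, and assignments of one iteration, 10 operations in the first loop and 6 in the cleanup loops):

```python
def merge_ops(L, R, arr):
    """Merge sorted L and R into arr; return a rough operation count."""
    ops = 3                    # i = j = k = 0 counts as 3 assignments
    i = j = k = 0
    while i < len(L) and j < len(R):
        ops += 10              # 10 constant operations per iteration
        if L[i] < R[j]:
            arr[k] = L[i]
            i += 1
        else:
            arr[k] = R[j]
            j += 1
        k += 1
    while i < len(L):
        ops += 6               # 6 constant operations per leftover element
        arr[k] = L[i]
        i += 1
        k += 1
    while j < len(R):
        ops += 6
        arr[k] = R[j]
        j += 1
        k += 1
    return ops

out = [0] * 6
ops = merge_ops([1, 4, 5], [2, 3, 6], out)
assert out == [1, 2, 3, 4, 5, 6]
assert ops <= 10 * len(out) + 3  # at most 10 operations per element, plus setup
```

The counter is deliberately coarse; its only job is to show that the total work is a constant multiple of the number of elements processed.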

This function has more going on, but we just need its overall complexity class to apply the Master Theorem accurately. We can count every single operation, that is, every comparison, array index, and assignment, or we can reason about it more generally. Generally speaking, across the three while loops we iterate through every member of L and R and assign each one, in order, to the output array arr, doing a constant amount of work per element. Noting that we process every element of L and R (n total elements) with a constant amount of work each is enough to say that merge is in O(n).

But you can get more particular with counting operations if you want. In the first while loop, every iteration performs 3 comparisons, 5 array indexes, and 2 assignments (constant numbers), and the loop runs until one of L and R is fully processed. Then one of the next two while loops may run to process any leftover elements from the other array, performing 1 comparison, 2 array indexes, and 3 variable assignments for each of those elements (constant work). Therefore, because each of the n total elements of L and R causes at most a constant number of operations across the while loops (either 10 or 6 by my count, so at most 10), and the i = j = k = 0 statement is only 3 constant assignments, merge is in O(3 + 10*n) = O(n). Returning to the overall problem, this means:

f(n) = O(1) + O(n) + complexity of merge
     = O(1) + O(n) + O(n)
     = O(2n + 1)
     = O(n)

T(n) = 2T(n/2) + n
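As an aside, for n a power of two this recurrence even has an exact closed form, T(n) = n*log2(n), if we assume the base case T(1) = 0 (a different constant base case only adds a Theta(n) term). A few lines confirm it:

```python
def T(n):
    """Unroll T(n) = 2*T(n/2) + n with assumed base case T(1) = 0."""
    return 0 if n <= 1 else 2 * T(n // 2) + n

# For n = 2^k the recurrence solves exactly to n * log2(n) = n * k.
for k in range(1, 12):
    n = 2 ** k
    assert T(n) == n * k
```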

One final step before we apply the Master Theorem: we want f(n) written as n^c. For f(n) = n = n^1, c=1. (Note: things change very slightly if f(n) = n^c * log^k(n) rather than simply n^c, but we don't need to worry about that here.)

You can now apply the Master Theorem, which in its most basic form says to compare a (how quickly the number of recursive calls grows) to b^c (how quickly the amount of work per recursive call shrinks). There are 3 possible cases; the logic of each is explained in parentheses, which you can ignore if they aren't helpful:

  1. a > b^c: T(n) = O(n^log_b(a)). (The total number of recursive calls is growing faster than the work per call is shrinking, so the total work is determined by the number of calls at the bottom level of the recursion tree. The number of calls starts at 1 and is multiplied by a a total of log_b(n) times, because log_b(n) is the depth of the recursion tree. Therefore, total work = a^log_b(n) = n^log_b(a).)

  2. a = b^c: T(n) = O(f(n)*log(n)). (The growth in the number of calls is balanced by the decrease in work per call. The work at each level of the recursion tree is therefore constant, so the total work is just f(n)*(depth of tree) = f(n)*log_b(n) = O(f(n)*log(n)).)

  3. a < b^c: T(n) = O(f(n)). (The work per call shrinks faster than the number of calls increases. The total work is therefore dominated by the work at the top level of the recursion tree, which is just f(n).)
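This three-way comparison is mechanical enough to wrap in a small helper (a hypothetical utility, not a standard library routine; it handles only pure polynomial f(n) = Theta(n^c), not the log-factor refinements):

```python
import math

def master_theorem(a, b, c):
    """Classify T(n) = a*T(n/b) + Theta(n^c) by comparing a with b**c."""
    if a > b ** c:
        return f"O(n^{math.log(a, b):g})"  # case 1: leaves dominate
    if a == b ** c:
        return f"O(n^{c:g} log n)"         # case 2: balanced levels
    return f"O(n^{c:g})"                   # case 3: root dominates

# mergeSort: a=2 recursive calls, b=2 (input halved), f(n) = Theta(n^1)
print(master_theorem(2, 2, 1))  # → O(n^1 log n)
```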

For the case of mergeSort, we've seen that a = 2, b = 2, and c = 1. Since a = b^c, we apply the second case:

T(n) = O(f(n)*log(n)) = O(n*log(n))

And you're done. This may seem like a lot of work, but coming up with a recurrence for T(n) gets easier the more you do it, and once you have a recurrence it's very quick to check which case it falls under, making the Master Theorem quite a useful tool for solving more complicated divide-and-conquer recurrences.
