Worst-Case Complexity of Below Code

I'm a total newbie to Java and to coding in general, and I have only just learnt Big-O. I came across this code on the internet yesterday ( http://www.dsalgo.com/2013/02/MaxKUsingMinHeap.php.html ) and would like to know whether the stated complexity analysis, O(n log k), is correct for the code below. Does it also cover the worst-case scenario? I'd really appreciate it if someone could go through this and explain.

import java.util.PriorityQueue;

public class MaxKUsingMinHeap {
    public static void main(String[] args) {
        int[] arr = { 
            3, 46, 2, 56, 3, 38, 93, 45, 6, 787, 34, 76, 44, 6, 7, 86, 8, 44, 56 
        };
        int[] result = getTopElements(arr, 5);
        for (int i : result) {
            System.out.print(i + ",");
        }
    }
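    // Returns the k largest elements of arr, in ascending order.
    public static int[] getTopElements(int[] arr, int k) {
        // Min-heap holding the k largest values seen so far; its root is the smallest of them.
        PriorityQueue<Integer> minHeap = new PriorityQueue<Integer>();
        for (int i = 0; i < arr.length; ++i) {
            int currentNum = arr[i];
            if (minHeap.size() < k) {
                // Heap not yet full: always keep the value.
                minHeap.add(currentNum);
            } else if (currentNum > minHeap.peek()) {
                // Value beats the smallest kept element: replace that element.
                minHeap.poll();
                minHeap.add(currentNum);
            }
        }
        // Drain the heap; poll() returns elements from smallest to largest.
        int[] result = new int[minHeap.size()];
        int index = 0;
        while (!minHeap.isEmpty()) {
            result[index++] = minHeap.poll();
        }
        return result;
    }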
}

Yes, this code never takes more than O(n log k) time, because each priority queue operation takes O(log k) and you perform at most O(n) of them.

The asymptotic complexity of the program you presented depends on the details of the PriorityQueue implementation. The java.util.PriorityQueue Javadoc documents O(log n) time for the enqueuing and dequeuing methods (offer, poll, remove() and add) and constant time for the retrieval methods (peek, element and size), so take the following as the cost of each method (as an upper bound in both the average and the worst case):

(constructor)       O(1)
size()              O(1)
add()               O(log(k))
peek()              O(1)
poll()              O(log(k))
isEmpty()           O(1)

where k is the number of elements currently in the queue. In particular, these are the characteristics of a queue backed by a "heap" data structure, which the variable naming (minHeap) appears to assume is the implementation (and a very reasonable assumption it is).
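To illustrate why add() on a heap-backed queue costs O(log k), here is a minimal sketch of an array-backed binary min-heap. TinyMinHeap and its methods are hypothetical names introduced for illustration; this is not the source of java.util.PriorityQueue. The point is only that an inserted value moves up at most one level per swap, and a heap of k elements has about log2(k) levels.

```java
import java.util.Arrays;

// Hypothetical minimal binary min-heap, for illustration only.
class TinyMinHeap {
    private int[] data = new int[16];
    private int size = 0;

    // Inserts value and returns the number of swaps performed while sifting up.
    // The occasional array growth is O(size) but amortized O(1) per insert.
    int add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        int i = size++;
        data[i] = value;
        int swaps = 0;
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (data[parent] <= data[i]) {
                break;                      // heap order restored
            }
            int tmp = data[parent];
            data[parent] = data[i];
            data[i] = tmp;
            i = parent;                     // moved up exactly one level
            swaps++;
        }
        return swaps;
    }

    // O(1): the minimum is always at the root (caller must ensure the heap is non-empty).
    int peek() {
        return data[0];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyMinHeap heap = new TinyMinHeap();
        int worst = 0;
        // Descending inserts are the worst case: every new value sifts all the way to the root.
        for (int v = 1023; v >= 1; v--) {
            worst = Math.max(worst, heap.add(v));
        }
        System.out.println("size=" + heap.size() + ", max swaps in one add=" + worst);
        // prints: size=1023, max swaps in one add=9
    }
}
```

Even with 1023 elements and adversarial input, a single insert needs at most 9 swaps, which is floor(log2(1023)).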

Now consider the method getTopElements(int[] arr, int k), and let n be arr.length. The method:

  • Allocates and initializes a PriorityQueue, in O(1) operations.
  • Iterates over the n elements of arr, incurring the following costs at each iteration:
    • copy a value from the array to a local variable, in O(1) operations.
    • determine whether to add the current number to the queue and, if so, whether to remove the least element first. The determination relies on the size() and peek() methods and on individual comparisons, so it requires O(1) operations.
    • in the event that the number is added, the cost of doing so is bounded by O(log k), because the number of elements in the queue is prevented from exceeding k. If an element is removed first, then the cost of the removal is also O(log k) operations. Since O(log k) + O(log k) = O(log k), the asymptotic complexity is not increased by the occasional need to remove an element first.
  • The method then allocates an array for the results, at a cost of O(1) operations.
  • It iterates min(n, k) times over the queue (once for each element), each time removing an element at a cost bounded by O(log k).
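The steps above can be checked empirically with a sketch that counts the O(log k) heap operations. HeapOpCounter, getTopElementsCounted and the two counters are hypothetical names added for illustration; the control flow is copied from the question's getTopElements. A strictly ascending array is used as the worst case, because every element after the first k beats the current minimum and triggers both a poll() and an add().

```java
import java.util.PriorityQueue;

// Hypothetical instrumented variant of getTopElements: same control flow,
// plus counters for the heap operations that cost O(log k).
class HeapOpCounter {
    static long adds;
    static long polls;

    // Assumes k >= 1, as the original code does.
    static int[] getTopElementsCounted(int[] arr, int k) {
        adds = 0;
        polls = 0;
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();
        for (int currentNum : arr) {
            if (minHeap.size() < k) {
                minHeap.add(currentNum);
                adds++;
            } else if (currentNum > minHeap.peek()) {
                minHeap.poll();
                polls++;
                minHeap.add(currentNum);
                adds++;
            }
        }
        int[] result = new int[minHeap.size()];
        int index = 0;
        while (!minHeap.isEmpty()) {
            result[index++] = minHeap.poll();
            polls++;
        }
        return result;
    }

    public static void main(String[] args) {
        int n = 1000;
        int k = 5;
        int[] ascending = new int[n];
        for (int i = 0; i < n; i++) {
            ascending[i] = i;   // worst case: each new element exceeds the heap minimum
        }
        int[] top = getTopElementsCounted(ascending, k);
        System.out.println("adds=" + adds + ", polls=" + polls + ", returned=" + top.length);
        // prints: adds=1000, polls=1000, returned=5
    }
}
```

In this worst case the first loop performs n adds and n - k polls, and the second loop performs min(n, k) polls, so the total number of O(log k) operations is at most 2n, which matches the bound derived from the list.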

For the first loop, the cost is bounded by n * (O(1) + O(1) + O(log k)) = n * O(log k) = O(n log k). For the second loop, it is bounded by min(n, k) * O(log k) = O(n log k) (we could also tighten this to O(k log k): because big-O is an upper bound, O(min(n, k)) is both O(n) and O(k)). Overall, the method therefore requires O(n log k) + O(n log k) = O(n log k) operations.
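The same bound can be written with explicit constants. This is only a sketch under stated assumptions: c_1 (the constant work per iteration) and c_2 (the constant factor of one heap operation) are symbols introduced here, logarithms are base 2, and k >= 2 so that log k >= 1.

```latex
\begin{aligned}
T(n,k) &\le \underbrace{n\,(c_1 + 2\,c_2 \log k)}_{\text{first loop}}
        + \underbrace{\min(n,k)\,c_2 \log k}_{\text{second loop}} \\
       &\le c_1\,n + 3\,c_2\,n \log k \\
       &\le (c_1 + 3\,c_2)\,n \log k = O(n \log k).
\end{aligned}
```

The factor 2 in the first loop covers an iteration that performs both a poll() and an add().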

In addition to that operation, the main method allocates an n-element array ( O(1) ), initializes its members ( O(n) ), and iterates over the results, printing each ( O(k) ). None of those costs exceeds that of the getTopElements() method, so that method's cost dominates the overall cost.
