
Extracting k largest elements

If I have n integers, is it possible to list the k largest elements out of the n values in O(k + log n) time? The closest I've gotten is constructing a max-heap and extracting the maximum k times, which takes O(k log n) time. I'm also thinking about using an in-order traversal.

Ways to solve this problem:

  1. Sort the data, then take the top k. Sorting takes O(n lg n) and iterating over the top k takes O(k). Total time: O(n lg n + k)

  2. Build a max-heap from the data and remove the top item k times. Building the heap is O(n), and each removal of the top item costs O(lg n) to reheapify. Total time: O(n) + O(k lg n)

  3. Keep a running min-heap of maximum size k. Iterate over all the data, adding each item to the heap and evicting the smallest whenever the heap grows beyond k items, and then take the entirety of the heap. Total time: O(n lg k) + O(k)

  4. Use a selection algorithm to find the k'th largest value. Then iterate over all the data to collect the items that are at least as large as that value (if the k'th value has duplicates, stop once k items are collected).

    a. You can find the k'th largest using QuickSelect, which has an average running time of O(n) but a worst case of O(n^2). Total average case time: O(n) + O(n) = O(n). Total worst case time: O(n^2) + O(n) = O(n^2).

    b. You can also find the k'th largest using the median-of-medians algorithm, which has a worst case running time of O(n) but is not in-place. Total time: O(n) + O(n) = O(n).
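Approach 3 above could look roughly like the following in Python. This is only a sketch: the function name k_largest_bounded_heap is made up for illustration, and the final sort (an extra O(k lg k)) is just there to return the result in descending order.

```python
import heapq

def k_largest_bounded_heap(values, k):
    """Return the k largest items of `values`, largest first (illustrative helper)."""
    if k <= 0:
        return []
    heap = []  # min-heap holding the k largest items seen so far
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)        # heap not full yet: O(lg k)
        elif v > heap[0]:
            # v beats the smallest of the current top k, so it replaces it
            heapq.heapreplace(heap, v)     # pop + push in one O(lg k) step
    # The heap now holds the answer; sorting is optional and costs O(k lg k)
    return sorted(heap, reverse=True)

print(k_largest_bounded_heap([5, 1, 9, 3, 7, 9, 2], 3))
```

The heap never holds more than k items, so this also works when the data arrives as a stream that does not fit in memory.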

You can use a divide-and-conquer technique to extract the k'th element from an array. The technique is sometimes called QuickSelect because it uses the idea of QuickSort.

In QuickSort, we pick a pivot element, then move the pivot to its correct position and partition the array around it. The idea is not to do a complete quicksort, but to stop at the point where the pivot itself is the k'th smallest element. Also, do not recurse into both the left and right sides of the pivot; recurse into only one of them, according to the position of the pivot. The worst case time complexity of this method is O(n^2), but it runs in O(n) on average.
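A minimal sketch of that idea in Python, adapted to the question: it partitions in descending order so the k largest items end up in the first k slots. The function name k_largest_quickselect, the random pivot choice, and working on a copy of the input are assumptions made for this example, not part of the answer above.

```python
import random

def k_largest_quickselect(values, k):
    """Return the k largest items of `values` in arbitrary order (illustrative helper)."""
    items = list(values)               # copy, so the caller's data is not reordered
    k = max(0, min(k, len(items)))
    if k == 0:
        return []
    lo, hi = 0, len(items) - 1
    while lo < hi:
        # Pick a random pivot and park it at the end of the active range
        p = random.randint(lo, hi)
        items[p], items[hi] = items[hi], items[p]
        pivot = items[hi]
        store = lo
        for i in range(lo, hi):
            if items[i] > pivot:       # larger items are moved to the left
                items[i], items[store] = items[store], items[i]
                store += 1
        # Put the pivot at its final position in descending order
        items[store], items[hi] = items[hi], items[store]
        if store == k - 1:
            break                      # pivot is the k'th largest: done
        elif store < k - 1:
            lo = store + 1             # continue only in the right part
        else:
            hi = store - 1             # continue only in the left part
    return items[:k]

print(sorted(k_largest_quickselect([5, 1, 9, 3, 7, 9, 2], 3), reverse=True))
```

Only one side of the pivot is visited after each partition, which is what gives the O(n) average; an unlucky sequence of pivots (or many equal values with this simple partition) still leads to the O(n^2) worst case.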

Constructing a heap takes O(n log n) if you insert the elements one at a time (a bottom-up heapify brings this down to O(n)), and extracting k elements takes O(k log n). If you reached the conclusion that extracting k elements is O(k log n), it means you're not worried about the time it takes to build the heap.

In that case, just sort the list (O(n log n)) and take the k largest elements (O(k)).
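For comparison, here is a rough Python sketch of both options: sorting and slicing, and building a heap then extracting k times. The function names are made up for illustration. Python's heapq is a min-heap, so the second version negates the values to simulate a max-heap, which assumes numeric input such as the integers in the question. Note that heapq.heapify builds the heap in linear time.

```python
import heapq

def k_largest_by_sorting(values, k):
    """Sort descending, then slice: O(n log n) + O(k)."""
    return sorted(values, reverse=True)[:max(k, 0)]

def k_largest_by_max_heap(values, k):
    """Heapify, then pop k times: O(n) + O(k log n)."""
    heap = [-v for v in values]        # negate to get max-heap behaviour
    heapq.heapify(heap)                # bottom-up build, linear time
    count = min(max(k, 0), len(heap))  # never pop more items than exist
    return [-heapq.heappop(heap) for _ in range(count)]

data = [5, 1, 9, 3, 7, 9, 2]
print(k_largest_by_sorting(data, 3))
print(k_largest_by_max_heap(data, 3))
```

Both return the same items in descending order; which one is faster in practice depends on n, k, and constant factors.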

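Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.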

 