
Fastest non-destructive method to get a sorted version of a heap?

I have a priority heap holding an event queue.

I need to dump this out for the user in order, but without rendering the heap unusable.

Obviously, if I were willing to destroy it, I could simply dequeue events until it was empty and add them, in order, to my sorted list. But then of course the heap is gone. Further, Quicksort is so much faster than a heap sort that I am not confident I can build this sorted list faster than I can copy the heap and sort the copy.

One idea I had was to in fact destroy the heap by dequeueing all its items, but then replace the now-empty priority queue with the resulting sorted list, which should maintain the heap property (cell i having a priority at least as high as cells 2i+1 and 2i+2). So I'm also wondering whether such a heap would perform better than a regular heap.
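The heap property mentioned above can be checked directly against the array layout. This is a minimal sketch, assuming a min-heap of plain `int` priorities stored in a `std::vector` (smaller value means higher priority); the function name `is_min_heap` is made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns true if `a` satisfies the min-heap property under the implicit
// array layout, i.e. a[i] <= a[2*i+1] and a[i] <= a[2*i+2] wherever those
// children exist. Checking every child against its parent is equivalent.
bool is_min_heap(const std::vector<int>& a) {
    for (std::size_t child = 1; child < a.size(); ++child) {
        std::size_t parent = (child - 1) / 2;
        if (a[child] < a[parent]) {
            return false;
        }
    }
    return true;
}
```

Any ascending array passes this check, but so do many unsorted arrays, such as 1, 5, 2, 6, 7, 3.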

The easiest solution is just to copy the heap array and sort the copy. But some sorts do a bad job when given sorted or somewhat-sorted data, and I'm wondering whether the Standard C++ library's sort (or C qsort()) can be trusted to sort a heap efficiently?
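The copy-and-sort approach might look like the sketch below. It assumes the heap lives in a `std::vector` you can read directly (a `std::priority_queue` does not expose its container); the `Event` struct and the `sorted_snapshot` name are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical event type: a smaller `time` means a higher priority.
struct Event {
    int time;
    int id;
};

// Returns the events ordered by time. The heap array itself is only read,
// so the live priority queue stays usable.
std::vector<Event> sorted_snapshot(const std::vector<Event>& heap) {
    std::vector<Event> copy(heap);
    std::sort(copy.begin(), copy.end(),
              [](const Event& a, const Event& b) { return a.time < b.time; });
    return copy;
}
```

The cost is one O(n) copy plus one O(n log n) sort, and the original array is never modified.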

With the usual heap implementation, just sort the heap in place. A sorted list satisfies the heap condition for a min-heap. Alternatively, if you sort the heap in descending order, you satisfy the heap condition for a max-heap. And you can always sort it one way and traverse it the other way if that is what you need.
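A small sketch of the in-place idea, assuming a min-heap of `int` stored in a `std::vector` and maintained with `std::greater<int>` as the comparator (the function name is made up for illustration):

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Sorts the backing array of a min-heap in ascending order, in place.
// An ascending array still satisfies the min-heap condition, so the
// container can keep serving as the priority queue afterwards.
void sort_min_heap_in_place(std::vector<int>& heap) {
    std::sort(heap.begin(), heap.end());
}
```

After the call, `std::is_heap(heap.begin(), heap.end(), std::greater<int>())` still holds, and iterating from front to back visits the items in priority order.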

Note that the std::sort_heap documentation warns about breaking the heap condition. If you change the heap data in place, make sure you have not broken it.

It looks to me like you're concerned about a performance problem that isn't really a problem. As I understand it, modern C++ implementations of std::sort use Introsort, which avoids the pathological worst-case times of a naïve Quicksort. And the difference between Quicksort and Heapsort, in the context of generating user output, is not large enough to be a concern.

Just copy the heap and sort it. Or call std::sort_heap and output the result, provided of course that doing so doesn't break the heap.
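One way to sidestep the question of breaking the heap is to run `std::sort_heap` on a copy. The sketch below assumes a min-heap of `int` maintained with `std::greater<int>`; the function name is hypothetical. With that comparator `std::sort_heap` leaves the range in descending order, so the copy is reversed before returning.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Returns the contents of a min-heap in ascending (priority) order without
// touching the original. The input must be a valid heap under
// std::greater<int>, which is what std::sort_heap requires here.
std::vector<int> heap_contents_in_order(const std::vector<int>& min_heap) {
    std::vector<int> copy(min_heap);
    // Sorted with respect to std::greater means descending order.
    std::sort_heap(copy.begin(), copy.end(), std::greater<int>());
    std::reverse(copy.begin(), copy.end());
    return copy;
}
```

Calling `std::sort_heap` directly on the live array would leave it in descending order, which is no longer a heap under the same comparator unless all elements are equal.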

You asked if a sorted heap performs better than a non-sorted heap. No. When adding an item, you still add it as the last node and sift it up. Half of the nodes in a heap are at the leaf level, so, assuming a uniform distribution of new items, half of the items you add will end up at the leaf level, requiring no swaps. The worst case is when every item you add ends up being the smallest (in a min-heap), in which case every insertion takes log(n) swaps to move the item to the root. Now, if every item added is larger than any other item in the heap, then of course addition is O(1). But that's true regardless of whether the heap was initially created from a sorted array.
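To make the swap counts concrete, here is a minimal sift-up sketch for a min-heap of `int` in a `std::vector`. The function name and the choice to return the number of swaps are assumptions made for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Appends `value` as the last node of a min-heap, sifts it up, and returns
// how many swaps were needed.
std::size_t push_counting_swaps(std::vector<int>& heap, int value) {
    heap.push_back(value);
    std::size_t i = heap.size() - 1;
    std::size_t swaps = 0;
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (heap[parent] <= heap[i]) {
            break;  // heap condition holds, the item stays here
        }
        std::swap(heap[parent], heap[i]);
        i = parent;
        ++swaps;
    }
    return swaps;
}
```

Starting from the sorted heap 1..7, pushing 8 needs no swaps, while pushing 0 afterwards climbs all the way to the root in 3 swaps.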

Deleting an item from the heap requires that you replace the root item with an item from the leaf level and then sift it down. In a sorted heap, the likelihood that the replacement item will end up back down at the leaf level is very high, which means that adjusting the heap will require the maximum of log(n) swaps. A sorted heap almost guarantees that removal will require the maximum number of swaps. In this case, a sorted heap is potentially worse in terms of performance than a heap constructed from a randomly arranged array.
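The removal step can be sketched the same way. This assumes a min-heap of `int` in a `std::vector`; the function name and the swap counter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Removes the root of a min-heap by moving the last leaf to the root and
// sifting it down. Returns how many swaps were needed.
std::size_t pop_counting_swaps(std::vector<int>& heap) {
    assert(!heap.empty());
    heap.front() = heap.back();
    heap.pop_back();
    const std::size_t n = heap.size();
    std::size_t i = 0;
    std::size_t swaps = 0;
    while (true) {
        std::size_t smallest = i;
        const std::size_t left = 2 * i + 1;
        const std::size_t right = 2 * i + 2;
        if (left < n && heap[left] < heap[smallest]) smallest = left;
        if (right < n && heap[right] < heap[smallest]) smallest = right;
        if (smallest == i) {
            break;  // both children are no smaller, the item stays here
        }
        std::swap(heap[i], heap[smallest]);
        i = smallest;
        ++swaps;
    }
    return swaps;
}
```

On the sorted heap 1..7, the replacement item 7 sinks back to the leaf level, taking 2 swaps, which is the full depth of the remaining six-element heap.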

But all that changes quickly as you begin adding items to and removing items from the heap. The heap becomes "not sorted" fairly quickly.

Over the life of the priority queue, it's highly unlikely that the initial order of items will make any noticeable difference in the performance of your binary heap.
