Why is std::sort() faster than std::make_heap()?

Question

I have 13721057 elements in my std::vector<Sequence>. I need to sort this vector and grab the first 25 elements. I thought that, since you can build a heap in O(N), it must be faster to pop 25 elements (each pop being O(log N)) than to sort the whole vector in O(N log N).

However, when I time the code:

clock_t tStart = clock();
sort(mostFrequent.begin(), mostFrequent.end(), greater<Sequence>());
printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);

vs.

clock_t tStart = clock();
make_heap(mostFrequent.begin(), mostFrequent.end());
printf("Time taken: %.2fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);

It appears to be much faster to sort the whole vector. Why is this?
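
For reference, the heap-based plan described above could be sketched roughly as follows. Note that the second timing snippet covers only make_heap (with the default operator<), not the 25 pops. The helper name top_k_heap is hypothetical, and int stands in for Sequence, whose definition is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the k largest values, largest first,
// by building a max-heap and popping k times.
// Assumption: int stands in for the Sequence type from the question.
std::vector<int> top_k_heap(std::vector<int> v, std::size_t k) {
    k = std::min(k, v.size());
    std::make_heap(v.begin(), v.end());   // max-heap using operator<, O(N)

    std::vector<int> out;
    out.reserve(k);
    auto last = v.end();
    for (std::size_t i = 0; i < k; ++i) {
        std::pop_heap(v.begin(), last);   // moves the current max to last - 1, O(log N)
        --last;
        out.push_back(*last);
    }
    return out;
}
```

With k = 25 the pops contribute very little, so the cost of this approach is dominated by the single make_heap call.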

Answer 1

This is not a full answer, but to get the first 25 elements out of 13721057 you are better off using partial_sort.
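
A minimal sketch of that suggestion, assuming int elements in place of Sequence and a hypothetical helper name top_k. std::partial_sort only orders the first k positions and performs approximately N log k comparisons, which for k = 25 is far fewer than a full sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: returns the k largest values in descending order.
// Assumption: int stands in for the Sequence type from the question.
std::vector<int> top_k(std::vector<int> v, std::size_t k) {
    k = std::min(k, v.size());
    // After this call [begin, begin + k) is sorted; the rest is in unspecified order.
    std::partial_sort(v.begin(), v.begin() + k, v.end(), std::greater<int>());
    v.resize(k);
    return v;
}
```

For the vector in the question the equivalent call would be std::partial_sort(mostFrequent.begin(), mostFrequent.begin() + 25, mostFrequent.end(), greater<Sequence>()), provided Sequence supports operator>.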

If you only need the 25th element, then use nth_element.
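
As a rough illustration, again with int standing in for Sequence and a hypothetical helper name kth_largest: std::nth_element places the element that would be at a given position in sorted order, in linear time on average, without sorting anything else.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: returns the k-th largest value (k is 1-based).
// Precondition (assumed, not checked): 1 <= k <= v.size().
int kth_largest(std::vector<int> v, std::size_t k) {
    auto nth = v.begin() + (k - 1);
    // Everything before nth compares >= *nth, everything after compares <= *nth.
    std::nth_element(v.begin(), nth, v.end(), std::greater<int>());
    return *nth;
}
```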

As a side note: to get the first elements less than X in sorted order, I would do auto mid = std::partition with a lambda, and then std::sort(begin, mid). There may be a better way.
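
That side note might look something like this. The helper name sorted_below is hypothetical and int again stands in for the real element type.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns all values less than x, in ascending order.
std::vector<int> sorted_below(std::vector<int> v, int x) {
    // Move every element < x to the front; mid points one past the last of them.
    auto mid = std::partition(v.begin(), v.end(), [x](int e) { return e < x; });
    // Sort only the qualifying prefix.
    std::sort(v.begin(), mid);
    v.erase(mid, v.end());
    return v;
}
```

The partition pass is linear, so the sort cost depends only on how many elements fall below X.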

Answer 2

EDIT: As suggested in a comment, I also tried a pre-sorted input. In that case I did manage to get sort to run faster than make_heap for my "expensive to copy" type, but only by a small margin of around 5-10%.

No matter what I try, I am unable to reproduce your results on either Solaris or Linux (gcc 4.4). make_heap has always come out at about 1/3 of the time sort takes.

No optimization vs. -O3 only changes the total time, not the relative order.
I used your exact number of items.
I first tried sorting int, then a larger "expensive to copy" class.
I guessed which includes you were using.
I moved the timing calls outside the printf to make sure they were always ordered properly.
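
One way to keep the timing calls separate from the printing, as in the last point, is a small wrapper like the one below. This is only a sketch, not the code used for the measurements above: time_seconds is a hypothetical name, and it uses std::chrono::steady_clock (wall-clock time) rather than clock() (CPU time) as in the question.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs f once and returns the elapsed wall-clock seconds.
// Both clock reads are sequenced strictly around the call, independent of
// any later printing of the result.
template <typename F>
double time_seconds(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

It could be used as double t = time_seconds([&] { std::make_heap(v.begin(), v.end()); }); followed by a separate printf of t. Each measurement should start from a fresh copy of the input, since both sort and make_heap modify the vector in place.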

I assume the actual reason for this discrepancy is either that your < and > operators do not have the same complexity, or that copying your object is expensive relative to comparing it in a way my test was unable to duplicate.

2 answers

Answer 1
12 2016-01-19 21:07:10

Answer 2
9 2016-01-19 21:31:25
