C++ std::sort implementation
I am wondering about the implementation of std::sort in C++11. I have an MPI-managed parallel code in which each rank reads data from a file into a vector A that needs to be sorted. Each rank calls std::sort to do this.
When I run this with ~100 ranks, there is sometimes one rank which appears to hang at this call to std::sort. Eventually I realized that it is not hanging; the sort just takes very long. That is, one rank takes ~200 times longer to sort than all of the others.
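One way to tell a pathological input apart from a slow node is to count the comparisons std::sort performs rather than only timing it. The following is a minimal sketch, not part of the original code: the helper name count_sort_comparisons and the element type double are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original code): sorts a copy of `data`
// and returns how many element comparisons std::sort performed.
// The element type double is an assumption; the question does not state it.
std::size_t count_sort_comparisons(std::vector<double> data) {
    std::size_t comparisons = 0;
    // The lambda captures the counter by reference, so every copy of the
    // comparator that std::sort makes internally updates the same counter.
    std::sort(data.begin(), data.end(),
              [&comparisons](double a, double b) {
                  ++comparisons;
                  return a < b;
              });
    return comparisons;
}
```

In an MPI run, each rank could print this count next to its rank number. A rank whose count is far above the others has an unfavourable input ordering; a rank with a normal count but a long run time points to something else, such as the node itself.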
At first I suspected a load-balancing issue, but I have checked thoroughly that the size of A per rank is as balanced as possible.
I've concluded that one rank may simply have an initial ordering of A such that something like the worst-case performance of quicksort is realized (or at least a non-ideal case).
Why do I think this?
If I change the MPI configuration (thereby perturbing the content of A per rank, since it comes from a file read), the issue disappears, or it can move to other ranks.
If I change std::sort to std::stable_sort (no longer using the quicksort algorithm), then all is fine.
However, it seems that the most sensible way to implement a quicksort would be to choose a random pivot on each iteration. If std::sort did that, it would be overwhelmingly unlikely to choose a worst-case value from A at random on many iterations (which would be required to produce a 200x performance hit).
Thus, my observations suggest that std::sort uses a fixed quicksort pivot (e.g. always the first value in the array, or something like that). This is the only way the behavior I'm seeing would be likely, and it would also explain why re-running on the same MPI configuration gives consistent results (which it does).
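The difference between a fixed and a random pivot can be made concrete with a small teaching quicksort that counts comparisons. This is only a sketch to illustrate the hypothesis; it is not the std::sort implementation, and the names quicksort_count, comparisons_first_pivot and comparisons_random_pivot are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Teaching quicksort on v[lo, hi); NOT how std::sort is implemented.
// `pick(lo, hi)` returns the index of the pivot to use.
// Returns the number of element comparisons performed.
template <typename PickPivot>
std::size_t quicksort_count(std::vector<int>& v, std::size_t lo,
                            std::size_t hi, PickPivot& pick) {
    std::size_t comparisons = 0;
    while (hi - lo > 1) {
        const std::size_t p = pick(lo, hi);
        assert(p >= lo && p < hi);
        std::swap(v[lo], v[p]);              // park the pivot at the front
        std::size_t store = lo + 1;
        for (std::size_t i = lo + 1; i < hi; ++i) {
            ++comparisons;
            if (v[i] < v[lo]) std::swap(v[i], v[store++]);
        }
        const std::size_t mid = store - 1;   // final position of the pivot
        std::swap(v[lo], v[mid]);
        // Recurse into the smaller part and loop on the larger one,
        // so the call stack stays shallow even in the worst case.
        if (mid - lo < hi - (mid + 1)) {
            comparisons += quicksort_count(v, lo, mid, pick);
            lo = mid + 1;
        } else {
            comparisons += quicksort_count(v, mid + 1, hi, pick);
            hi = mid;
        }
    }
    return comparisons;
}

// Fixed pivot: always the first element of the current range.
std::size_t comparisons_first_pivot(std::vector<int> v) {
    auto pick = [](std::size_t lo, std::size_t) { return lo; };
    return quicksort_count(v, 0, v.size(), pick);
}

// Random pivot: uniformly chosen index within the current range.
std::size_t comparisons_random_pivot(std::vector<int> v, unsigned seed) {
    std::mt19937 gen(seed);
    auto pick = [&gen](std::size_t lo, std::size_t hi) {
        return std::uniform_int_distribution<std::size_t>(lo, hi - 1)(gen);
    };
    return quicksort_count(v, 0, v.size(), pick);
}
```

On already sorted input of n elements, the first-element pivot always splits off an empty part, so this sketch performs n(n-1)/2 comparisons, and the same happens on every re-run with the same data. With a random pivot the count stays near n log n for any fixed input, apart from a vanishingly small probability of repeated bad draws.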
Am I correct in that conclusion? I did manage to find the std source, but the sort function is totally unreadable and makes a plethora of calls to various helper functions, and I'd rather avoid a rabbit hole. Aside from that, I'm running on an HPC system, and it's not even clear to me how to be sure what exactly mpicxx is linking to. I can't find any documentation that describes the algorithm implementation.
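One way to find out which standard library a compiler wrapper such as mpicxx actually uses is to compile a tiny program with it that checks the library's own version macros. The sketch below assumes one of the three common implementations; the function name standard_library_name is hypothetical.

```cpp
#include <cstddef>  // including any standard header makes the library's macros visible
#include <string>

// Hypothetical helper: reports which C++ standard library this translation
// unit was compiled against, based on that library's version macro.
std::string standard_library_name() {
#if defined(__GLIBCXX__)
    return "libstdc++ (GNU), __GLIBCXX__ = " + std::to_string(__GLIBCXX__);
#elif defined(_LIBCPP_VERSION)
    return "libc++ (LLVM), _LIBCPP_VERSION = " + std::to_string(_LIBCPP_VERSION);
#elif defined(_MSVC_STL_VERSION)
    return "MSVC STL, _MSVC_STL_VERSION = " + std::to_string(_MSVC_STL_VERSION);
#else
    return "unknown standard library";
#endif
}
```

Printing the returned string from a program built with mpicxx shows which library's std::sort is in play. The wrapper can usually also print the underlying compiler command: `mpicxx -show` with MPICH, or `mpicxx --showme` with Open MPI.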
std::sort is implementation specific.
Since C++11, a plain quicksort is no longer a valid implementation, because the required complexity changed from O(N log N) comparisons on average to O(N log N) comparisons in the worst case.
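A well-known way to get an O(N log N) worst case while keeping quicksort's typical speed is introsort: run quicksort, but switch to heapsort once the recursion gets deeper than about 2 log2(N). The following is a simplified sketch of that idea for std::vector<int>, not the code of any particular standard library; the names sketch_introsort and sketch_introsort_impl are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using It = std::vector<int>::iterator;

// Sorts [first, last). `depth_left` is the remaining budget of quicksort
// partitioning levels before falling back to heapsort.
void sketch_introsort_impl(It first, It last, int depth_left) {
    while (last - first > 1) {
        if (depth_left == 0) {
            // Budget exhausted: heapsort guarantees O(n log n) on this range.
            std::make_heap(first, last);
            std::sort_heap(first, last);
            return;
        }
        --depth_left;
        const int pivot = *(first + (last - first) / 2);
        // Three-way split: [first, mid1) < pivot, [mid1, mid2) == pivot,
        // [mid2, last) > pivot. The middle part is never empty, so both
        // remaining parts are strictly smaller than the current range.
        It mid1 = std::partition(first, last,
                                 [pivot](int x) { return x < pivot; });
        It mid2 = std::partition(mid1, last,
                                 [pivot](int x) { return !(pivot < x); });
        sketch_introsort_impl(first, mid1, depth_left);  // left part
        first = mid2;                                    // loop on right part
    }
}

void sketch_introsort(std::vector<int>& v) {
    int depth = 0;
    for (std::size_t n = v.size(); n > 1; n /= 2) depth += 2;  // ~2*log2(n)
    sketch_introsort_impl(v.begin(), v.end(), depth);
}
```

Because the quicksort phase is cut off at a logarithmic depth and each level touches every element at most a constant number of times, the total work stays O(N log N) even for inputs that would drive a plain quicksort to quadratic time.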