
Is stdlib's qsort recursive?

Original post: 2010-07-31 20:43:46 · Tags: c, sorting, libc, qsort

I've read that qsort is just a generic sort, with no promises about its implementation. I don't know how libraries vary from platform to platform, but assuming the Mac OS X and Linux implementations are broadly similar, are the qsort implementations recursive, and/or do they require a lot of stack?

I have a large array (hundreds of thousands of elements) and I want to sort it without blowing my stack to oblivion. Alternatively, any suggestions for an equivalent for large arrays?

9 answers

Here's a version from BSD, copyright Apple, presumably used in OS X at some time or another:

http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/kern/qsort.c

It is call-recursive, although the upper bound on the depth of recursion is small, as Blindy explains.

Here's a version from glibc, presumably used in Linux systems at some time or another:

http://www.umcs.maine.edu/~chaw/200801/capstone/n/qsort.c

It's not call-recursive. For exactly the same reason that the limit on call-recursion is small, it can use a small fixed amount of stack to manage its loop-recursion.
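
To make the "loop-recursion" idea concrete, here is a minimal sketch of a quicksort that uses a loop plus a small fixed-size array of pending ranges instead of function calls. It is an illustration only and is not taken from the glibc source; the names quicksort_iterative, partition, range and MAX_PENDING are made up for this example, and it sorts plain ints rather than taking a comparator.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical illustration, not library code. One slot per bit of size_t
   is enough, because at most log2(n) ranges are ever pending. */
#define MAX_PENDING (CHAR_BIT * sizeof(size_t))

typedef struct { size_t lo, hi; } range;   /* inclusive bounds */

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Hoare-style partition around the middle value. Requires lo < hi.
   Returns j with lo <= j < hi such that every element of a[lo..j]
   is <= every element of a[j+1..hi]. */
static size_t partition(int *a, size_t lo, size_t hi)
{
    int pivot = a[lo + (hi - lo) / 2];
    size_t i = lo, j = hi;
    for (;;) {
        while (a[i] < pivot) i++;
        while (a[j] > pivot) j--;
        if (i >= j) return j;
        swap_int(&a[i], &a[j]);
        i++;
        j--;
    }
}

void quicksort_iterative(int *a, size_t n)
{
    range pending[MAX_PENDING];   /* the only "stack" the sort needs */
    size_t top = 0;
    size_t lo = 0, hi;

    if (n < 2) return;
    hi = n - 1;

    for (;;) {
        while (lo < hi) {
            size_t j = partition(a, lo, hi);
            /* Defer the larger part and keep looping on the smaller one.
               The range we continue with is at most half the current one,
               which is what bounds the number of pending ranges. */
            assert(top < MAX_PENDING);
            if (j - lo + 1 < hi - j) {
                pending[top++] = (range){ j + 1, hi };
                hi = j;
            } else {
                pending[top++] = (range){ lo, j };
                lo = j + 1;
            }
        }
        if (top == 0) break;
        range r = pending[--top];   /* resume a deferred range */
        lo = r.lo;
        hi = r.hi;
    }
}
```

Calling quicksort_iterative(array, count) sorts the ints in place. The pending array is sized at compile time, so the stack cost is the same for ten elements as for ten million.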

Can I be bothered to look up the latest versions? Nope ;-)

For a few hundred thousand array elements, even the call-recursive implementation won't call more than 20 levels deep (log2 of 500,000 is about 19). In the grand scheme of things that is not deep, except on very limited embedded devices, which wouldn't have enough memory for you to have an array that big to sort in the first place. When N is bounded above, O(log N) obviously is a constant, but more than that it's normally quite a manageable constant. Usually 32 or 64 times "small" is "reasonable".

You know, the recursive part is log n deep. In 64 levels of recursion (which is ~64*4 = ~256 bytes of stack total) you can sort an array of size ~2^64, i.e. an array as large as you can address on a 64-bit CPU, which is 147573952589676412928 bytes for 64-bit integers. You can't even hold it in memory!

Worry about stuff that matters, IMO.

Yes, it's recursive. No, it probably will not use large amounts of stack. Why not simply try it? Recursion is not some kind of bogey; it's the solution of choice for very many problems.
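
Trying it takes only a few lines. The sketch below is one way to do it, under some assumptions: the helper names try_qsort and compare_ints are invented for this example, the data is pseudo-random ints, and the array lives on the heap so that the array itself costs no stack.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparator for qsort. Written with comparisons rather than "x - y",
   which can overflow for large values. */
static int compare_ints(const void *pa, const void *pb)
{
    int x = *(const int *)pa;
    int y = *(const int *)pb;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts n pseudo-random ints with the library qsort.
   Returns 1 if the result is in order, 0 if it is not,
   and -1 if the allocation failed. */
int try_qsort(size_t n)
{
    int *a;
    size_t i;
    int ok = 1;

    assert(n > 0);
    a = malloc(n * sizeof *a);
    if (a == NULL) return -1;

    srand(12345);                      /* fixed seed, repeatable input */
    for (i = 0; i < n; i++) a[i] = rand();

    qsort(a, n, sizeof *a, compare_ints);

    for (i = 1; i < n; i++) {
        if (a[i - 1] > a[i]) { ok = 0; break; }
    }
    free(a);
    return ok;
}
```

Calling try_qsort(500000) exercises the array size mentioned in the question. If the platform's qsort needed an unreasonable amount of stack, this is where it would crash.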

A properly implemented qsort does not require more than log2(N) levels of recursion (i.e. depth of stack), where N is the largest array size on the given platform. Note that this limit applies regardless of how good or bad the partitioning happens to be, i.e. it is the worst-case depth of recursion. For example, on a 32-bit platform, the depth of recursion will never exceed 32 in the worst possible case, given a sane implementation of qsort.

In other words, if you are concerned about the stack usage specifically, you have nothing to worry about, unless you are dealing with some strange low-quality implementation.
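
The reason the bound holds even with bad partitioning can be sketched as follows: recurse only into the smaller part and loop on the larger one, so every nested call handles at most half of its caller's range. This is an illustrative sketch, not any particular libc's code. The names quicksort_smaller_first, sort_at_depth, partition_last and max_depth_seen are hypothetical, and the partition is deliberately naive so that sorted input gives the most lopsided splits.

```c
#include <assert.h>
#include <stddef.h>

static size_t max_depth_seen;   /* instrumentation only, not part of the sort */

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition using the last element as pivot.
   Returns the final index of the pivot. */
static size_t partition_last(int *a, size_t n)
{
    int pivot;
    size_t store = 0, i;

    assert(n >= 1);
    pivot = a[n - 1];
    for (i = 0; i + 1 < n; i++) {
        if (a[i] < pivot) swap_int(&a[i], &a[store++]);
    }
    swap_int(&a[store], &a[n - 1]);
    return store;
}

static void sort_at_depth(int *a, size_t n, size_t depth)
{
    if (depth > max_depth_seen) max_depth_seen = depth;
    while (n > 1) {
        size_t p = partition_last(a, n);
        size_t left = p;             /* elements before the pivot */
        size_t right = n - p - 1;    /* elements after the pivot  */
        if (left < right) {
            sort_at_depth(a, left, depth + 1);           /* recurse: smaller */
            a += p + 1;                                  /* loop: larger     */
            n = right;
        } else {
            sort_at_depth(a + p + 1, right, depth + 1);  /* recurse: smaller */
            n = left;                                    /* loop: larger     */
        }
    }
}

/* Sorts a[0..n-1] and returns the deepest call level reached
   (the outermost call counts as level 1). */
size_t quicksort_smaller_first(int *a, size_t n)
{
    max_depth_seen = 0;
    sort_at_depth(a, n, 1);
    return max_depth_seen;
}
```

With this structure the returned depth cannot exceed floor(log2(n)) + 1, even on already-sorted input, where this pivot choice makes the running time quadratic. Bad partitioning costs time here, not stack.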

I remember reading in the book C Programming: A Modern Approach that the ANSI C specification doesn't define how to implement qsort.

And the book said that qsort could in reality be another kind of sort: merge sort, insertion sort, and why not bubble sort :P

So, the qsort implementation might not be recursive.

With quicksort, the stack will grow logarithmically. You will need a lot of elements to blow up your stack.

I'd guess that most modern implementations of qsort actually use the Introsort algorithm. A reasonably written Quicksort won't blow the stack anyway (it'll sort the smaller partition first, which limits stack depth to logarithmic growth).

Introsort goes a step further though: to limit the worst-case complexity, if it sees that Quicksort isn't working well (too much recursion, so it could have O(N²) complexity), it'll switch to a Heapsort, which guarantees O(N log₂ N) complexity and limits stack usage as well. Therefore, even if the Quicksort it uses is sloppily written, the switch to Heapsort will limit stack usage anyway.
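
A minimal sketch of that switch, under stated assumptions: the depth budget of 2 * floor(log2(n)) is a common choice but an assumption here, the names introsort_ints, introsort_rec, heapsort_ints and partition_last are invented for this example, and the quicksort part is deliberately sloppy (last-element pivot, no smaller-first ordering) to show that the budget alone caps the recursion depth. Real implementations typically also finish small ranges with insertion sort, which is omitted.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Restore the max-heap property for the subtree rooted at `root`. */
static void sift_down(int *a, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n) return;
        if (child + 1 < n && a[child] < a[child + 1]) child++;
        if (a[root] >= a[child]) return;
        swap_int(&a[root], &a[child]);
        root = child;
    }
}

/* In-place heapsort: O(n log n) in the worst case, no recursion. */
static void heapsort_ints(int *a, size_t n)
{
    size_t i;
    if (n < 2) return;
    for (i = n / 2; i-- > 0; ) sift_down(a, i, n);   /* build the heap */
    for (i = n - 1; i > 0; i--) {
        swap_int(&a[0], &a[i]);                      /* move max to the end */
        sift_down(a, 0, i);
    }
}

/* Lomuto partition, last element as pivot (naive on purpose). */
static size_t partition_last(int *a, size_t n)
{
    int pivot;
    size_t store = 0, i;

    assert(n >= 1);
    pivot = a[n - 1];
    for (i = 0; i + 1 < n; i++) {
        if (a[i] < pivot) swap_int(&a[i], &a[store++]);
    }
    swap_int(&a[store], &a[n - 1]);
    return store;
}

static void introsort_rec(int *a, size_t n, unsigned budget)
{
    while (n > 1) {
        size_t p;
        if (budget == 0) {          /* quicksort is degenerating: bail out */
            heapsort_ints(a, n);
            return;
        }
        budget--;
        p = partition_last(a, n);
        introsort_rec(a, p, budget);   /* left part: nested call  */
        a += p + 1;                    /* right part: keep looping */
        n -= p + 1;
    }
}

void introsort_ints(int *a, size_t n)
{
    unsigned budget = 0;
    size_t m;
    for (m = n; m > 1; m /= 2) budget += 2;   /* 2 * floor(log2(n)) */
    introsort_rec(a, n, budget);
}
```

Because every nested call receives a smaller budget, the call depth can never exceed the initial budget, whatever the input looks like. Already-sorted input, which would drive this naive quicksort to quadratic time and linear depth on its own, simply ends up in heapsort_ints.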

A qsort implementation which can fail on large arrays is extremely broken. If you're really worried I'd go RTFS, but I suspect any half-decent implementation will either use an in-place sorting algorithm or use malloc for temporary space and fall back to an in-place algorithm if malloc fails.

The worst-case space complexity of a naive quicksort implementation (which is still a popular option for qsort) is O(N). If the implementation is modified to sort the smaller array first, and tail-recursion optimisation or an explicit stack and iteration is used, then the worst-case space can be brought down to O(log N) (as most answers here have already said). So you will not blow up your stack if the implementation of quicksort is not broken and the library was not broken by improper compiler flags. But, for example, most compilers that support tail-recursion elimination won't perform this optimization in unoptimized debug builds. A library built with the wrong flags (i.e. not enough optimization, for example in the embedded domain, where you sometimes build your own debug libc) might overflow the stack then.

For most developers, this will never be an issue (they have vendor-tested libcs which have O(log N) space complexity), but I'd say it is a good idea to keep an eye on potential library issues from time to time.

UPDATE: Here's an example of what I mean: a bug in libc (from 2000) where qsort would start thrashing virtual memory because the qsort implementation would switch internally to mergesort because it thought there was enough memory to hold a temporary array.

http://sources.redhat.com/ml/libc-alpha/2000-03/msg00139.html