How is the shell sort algorithm any better than the merge sort algorithm?

Original post: 2015-10-01 12:19:16 5 3 java/ algorithm/ shell/ sorting/ merge

I have a small presentation about algorithms with a group of nerds, and I was randomly tasked with convincing them that shell sort is better than merge sort. I have been reading for almost a week, but no matter how much I read about merge sort and shell sort, I find merge sort better than shell sort.

Are there any advantages of shell sort over merge sort? I mean, under what circumstances is shell sort better than merge sort? I might have missed something, but I don't know what.

Any tips would be fine, or if possible, can you link me to something helpful?

3 Answers

You have to remember the context in which shellsort was proposed: shellsort was published in 1959; quicksort, in 1961; mergesort, in 1948 (OK, that was a bit surprising). The computers of the day were slow and had small memories. Thus the asymptotic advantage of mergesort was hardly relevant compared to the increased complexity of its implementation and code. In fact, shellsort gets the quadratic fallback of modern practical mergesorts for free, since shellsort's final pass, with a gap of 1, is plain insertion sort.

It was not known then how to do an efficient in-place merge (and even now, no one implements it, because it's wildly inefficient in practice).

Shellsort has an uncomplicated nonrecursive implementation. Recursion in higher-level languages was confined to LISP (impractical then, not to mention lacking an array type) and the as-yet unimplemented ALGOL 60 standard.
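To give a sense of how little code that is, here is a minimal sketch in Java. It assumes Shell's original gap sequence (n/2, n/4, ..., 1); the class and method names are made up for illustration, and other gap sequences are common in practice.

```java
import java.util.Arrays;

final class ShellSortSketch {
    // Sorts the array in place using the gaps n/2, n/4, ..., 1.
    static void shellSort(int[] a) {
        for (int gap = a.length / 2; gap > 0; gap /= 2) {
            // Gapped insertion sort: each element moves left in steps of `gap`.
            for (int i = gap; i < a.length; i++) {
                int x = a[i];
                int j = i;
                while (j >= gap && a[j - gap] > x) {
                    a[j] = a[j - gap];
                    j -= gap;
                }
                a[j] = x;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0};
        shellSort(data);
        System.out.println(Arrays.toString(data)); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

There is no recursion and no auxiliary array, only three nested loops, and the last pass (gap 1) is an ordinary insertion sort.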

Shellsort's running time improves a lot on mostly sorted data. (It's no Timsort, though.)
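One rough way to see this is to count element comparisons. The sketch below is an illustration only: it assumes the gap sequence n/2, n/4, ..., 1, and the class and method names are hypothetical. On already sorted input every gapped pass makes exactly one comparison per element it visits, while reversed input forces extra comparisons and moves.

```java
final class ShellSortComparisonCount {
    // Shellsort with gaps n/2, n/4, ..., 1; returns the number of element comparisons made.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int gap = a.length / 2; gap > 0; gap /= 2) {
            for (int i = gap; i < a.length; i++) {
                int x = a[i];
                int j = i;
                while (j >= gap) {
                    comparisons++;                 // one comparison of a[j - gap] with x
                    if (a[j - gap] <= x) break;    // already in order: stop immediately
                    a[j] = a[j - gap];
                    j -= gap;
                }
                a[j] = x;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] sorted = new int[n];
        int[] reversed = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
            reversed[i] = n - i;
        }
        System.out.println("sorted:   " + sortAndCount(sorted));
        System.out.println("reversed: " + sortAndCount(reversed));
    }
}
```

For 16 sorted elements the gaps 8, 4, 2, 1 visit 8 + 12 + 14 + 15 = 49 elements, so the sorted case reports 49 comparisons; the reversed case reports more.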

Merge sort is normally faster than shell sort, but shell sort is in place. Quick sort is faster when sorting the data itself, but merge sort is usually faster when sorting an array of pointers or indices to data, provided the compare overhead for the elements is greater than the move overhead for the pointers or indices, since merge sort uses fewer compares but more moves than quick sort. If sorting an array of somewhat random integers, then counting / radix sort is fastest.

As mentioned, merge sort was published in 1948. Merge sort on old mainframes was implemented on tape drives or disk drives. For tape drives, there were/are variations of merge sort:

http://en.wikipedia.org/wiki/Polyphase_merge_sort

http://en.wikipedia.org/wiki/Oscillating_merge_sort

Natural merge sort takes advantage of any existing natural ordering, but has the overhead of keeping track of variable-size runs. With tape drives, this can/could be done using single file marks for end of run and double file marks for end of data. Early disk drives with variable-sized blocks could implement something similar (using small blocks to indicate end of run / end of data).

http://en.wikipedia.org/wiki/Merge_sort#Natural_merge_sort
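For an in-memory array, the bookkeeping for variable-size runs can be as simple as a list of boundaries. The following is a minimal sketch of the run-detection step only, not a full natural merge sort; the class and method names are hypothetical, and it assumes runs are maximal non-decreasing stretches.

```java
import java.util.ArrayList;
import java.util.List;

final class NaturalRunsSketch {
    // Returns the start index of every maximal non-decreasing run,
    // followed by a.length as an end marker (so runs = size - 1).
    static List<Integer> runBoundaries(int[] a) {
        List<Integer> bounds = new ArrayList<>();
        bounds.add(0);
        for (int i = 1; i < a.length; i++) {
            if (a[i] < a[i - 1]) {
                bounds.add(i); // order breaks here, so a new run starts at i
            }
        }
        if (a.length > 0) {
            bounds.add(a.length);
        }
        return bounds;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 5, 2, 4, 9, 0, 7};
        System.out.println(runBoundaries(data)); // prints [0, 3, 6, 8]
    }
}
```

Here the input already contains three runs (1 3 5, 2 4 9, 0 7), so a natural merge sort would start by merging those instead of runs of size 1. The boundary list plays the role that file marks play on tape.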

An alternative to natural merge sort is tim sort, where natural ordering and/or forced ordering using insertion sort is used to create runs of at least a minimum size during the initial pass:

http://en.wikipedia.org/wiki/Timsort

The "classic" merge sort is bottom up merge sort. In the case of an external sort using tape drives or disk drives, the initial pass sorts data in memory in order to skip the initial merge passes, similar to tim sort, except that the memory sort may not have been insertion sort. Generally an array of pointers or indices was sorted and the data was written according to those pointers or indices, as opposed to sorting the data in memory before writing. On some systems, a single I/O with multiple pointers / lengths to data is/was used. SATA / IDE / SCSI PC controllers have a set of descriptors that hold address / length data to deal with paged memory, but I don't know if any high-end sort programs for PCs use the descriptors to write a set of records for merge sort with a single I/O.
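As a point of comparison with shell sort, here is a minimal in-memory sketch of bottom up merge sort in Java. It is an illustration under simplifying assumptions: it starts from runs of size 1 rather than an initial in-memory sort pass, it sorts the values directly rather than pointers or indices, and the class and method names are hypothetical.

```java
import java.util.Arrays;

final class BottomUpMergeSortSketch {
    // Iterative merge sort: merges runs of width 1, 2, 4, ... between two buffers.
    // Assumes the array is far smaller than Integer.MAX_VALUE elements.
    static void sort(int[] a) {
        int n = a.length;
        int[] src = a;
        int[] dst = new int[n]; // the O(n) scratch space shell sort does not need
        for (int width = 1; width < n; width *= 2) {
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                merge(src, dst, lo, mid, hi);
            }
            int[] t = src; // swap roles of the buffers for the next pass
            src = dst;
            dst = t;
        }
        if (src != a) {
            System.arraycopy(src, 0, a, 0, n);
        }
    }

    // Merges the sorted ranges src[lo, mid) and src[mid, hi) into dst[lo, hi).
    private static void merge(int[] src, int[] dst, int lo, int mid, int hi) {
        int i = lo;
        int j = mid;
        for (int k = lo; k < hi; k++) {
            if (i < mid && (j >= hi || src[i] <= src[j])) {
                dst[k] = src[i++];
            } else {
                dst[k] = src[j++];
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

Like shellsort it needs no recursion, but it does need a second buffer the size of the input.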

I'm not sure when top down merge sort was first published. Rather than starting off with some fixed or variable run size and using iteration to advance indices or pointers while merging runs, it recursively generates indices or pointers until they represent some small fixed run size, typically a run size of 1, and only then does any actual merging of data take place. Whatever advantage there might be from the cache locality of a depth-first / left-first ordering of run merges is offset by the overhead of recursion, and generally top down merge sort is slightly slower (about 5%) than bottom up merge sort.
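The recursive variant described here can be sketched as follows. This is a minimal illustration with hypothetical names, assuming a run size of 1 as the base case and a single scratch buffer allocated up front; it makes no attempt to reproduce the timing difference mentioned above.

```java
import java.util.Arrays;

final class TopDownMergeSortSketch {
    static void sort(int[] a) {
        int[] tmp = new int[a.length]; // scratch buffer shared by all merges
        sort(a, tmp, 0, a.length);
    }

    // Recursively splits a[lo, hi) until runs have size 1, then merges on the way back up.
    private static void sort(int[] a, int[] tmp, int lo, int hi) {
        if (hi - lo < 2) {
            return; // a run of size 0 or 1 is already sorted
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, tmp, lo, mid);  // left half first (depth first / left first)
        sort(a, tmp, mid, hi);
        // Copy the two sorted halves out, then merge them back into a[lo, hi).
        System.arraycopy(a, lo, tmp, lo, hi - lo);
        int i = lo;
        int j = mid;
        for (int k = lo; k < hi; k++) {
            if (i < mid && (j >= hi || tmp[i] <= tmp[j])) {
                a[k] = tmp[i++];
            } else {
                a[k] = tmp[j++];
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0};
        sort(data);
        System.out.println(Arrays.toString(data)); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

No data moves until the recursion has bottomed out at single elements, which is the behavior the paragraph above describes.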

It depends on your definition of "better", but if you look at a commonly used metric, worst-case performance, merge sort ( O(n log n) ) is actually faster for large lists than shell sort ( O(n^2) with simple gap sequences such as Shell's original ).

In terms of space complexity, both can be implemented in place, so there's no advantage here for shell sort either.