QuickSort stack overflow for sorted arrays (works for other data sets)

Question

I tried my best to optimize my Quicksort algorithm to run as efficiently as possible, even for sorted or nearly sorted arrays, by using a median-of-three pivot and by switching to insertion sort for small partitions. I have tested my code on large arrays of random values and it works, but when I pass an already sorted array I get a stack overflow error (ironic, because it led me to find this website). I believe the problem is in my recursive calls (I know that the partitioning works for other data sets at least), but I can't quite see what to change.

This is part of my first-semester data structures class, so any code review will help as well. Thanks.

public void quickSort(ArrayList<String> data, int firstIndex, int numberToSort) {
    if (firstIndex < (firstIndex + numberToSort - 1))
        if (numberToSort < 16) {
            insertionSort(data, firstIndex, numberToSort);
        } else {
            int pivot = partition(data, firstIndex, numberToSort);
            int leftSegmentSize = pivot - firstIndex;
            int rightSegmentSize = numberToSort - leftSegmentSize - 1;
            quickSort(data, firstIndex, leftSegmentSize);
            quickSort(data, pivot + 1, rightSegmentSize);
        }
}



public int partition(ArrayList<String> data, int firstIndex, int numberToPartition) {
    int tooBigNdx = firstIndex + 1;
    int tooSmallNdx = firstIndex + numberToPartition - 1;

    String string1 = data.get(firstIndex);
    String string2 = data.get((firstIndex + (numberToPartition - 1)) / 2);
    String string3 = data.get(firstIndex + numberToPartition - 1);
    ArrayList<String> randomStrings = new ArrayList<String>();
    randomStrings.add(string1);
    randomStrings.add(string2);
    randomStrings.add(string3);
    Collections.sort(randomStrings);
    String pivot = randomStrings.get(1);
    if (pivot == string2) {
        Collections.swap(data, firstIndex, (firstIndex + (numberToPartition - 1)) / 2);
    }
    if (pivot == string3) {
        Collections.swap(data, firstIndex, firstIndex + numberToPartition - 1);
    }
    while (tooBigNdx < tooSmallNdx) {
        while ((tooBigNdx < tooSmallNdx) && (data.get(tooBigNdx).compareTo(pivot) <= 0)) {
            tooBigNdx++;
        }
        while ((tooSmallNdx > firstIndex) && (data.get(tooSmallNdx).compareTo(pivot) > 0)) {
            tooSmallNdx--;
        }
        if (tooBigNdx < tooSmallNdx) {// swap
            Collections.swap(data, tooSmallNdx, tooBigNdx);
        }
    }
    if (pivot.compareTo(data.get(tooSmallNdx)) >= 0) {
        Collections.swap(data, firstIndex, tooSmallNdx);
        return tooSmallNdx;
    } else {
        return firstIndex;
    }
}

Answer 1

You can avoid stack overflows without changing your algorithm too much. The trick is to handle the larger partition in a loop (a manual tail-call elimination) and to use recursion only on the smaller one. This usually means you have to change your if to a while. I can't really test Java code right now, but it should look something like:

public void quickSort(ArrayList<String> data, int firstIndex, int numberToSort) {
    // Loop while the current segment holds more than one element.
    while (firstIndex < (firstIndex + numberToSort - 1)) {
        if (numberToSort < 16) {
            insertionSort(data, firstIndex, numberToSort);
            // The segment is now fully sorted; without this return the loop would never end.
            return;
        } else {
            int pivot = partition(data, firstIndex, numberToSort);
            int leftSegmentSize = pivot - firstIndex;
            int rightSegmentSize = numberToSort - leftSegmentSize - 1;

            // Only use recursion for the smaller partition;
            // the larger one is handled by the next loop iteration.
            if (leftSegmentSize < rightSegmentSize) {
                quickSort(data, firstIndex, leftSegmentSize);
                firstIndex = pivot + 1;
                numberToSort = rightSegmentSize;
            } else {
                quickSort(data, pivot + 1, rightSegmentSize);
                numberToSort = leftSegmentSize;
            }
        }
    }
}

This ensures that the call stack depth will be at most O(log n), because each recursive call only handles a segment of at most half the size of the segment it was split from.
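To make the depth bound concrete, here is a minimal, self-contained sketch that applies the same "recurse on the smaller side, loop on the larger side" idea to a plain int[] and records the deepest recursion level it reaches. The class name SmallerFirstQuickSort, the maxDepth counter, and the deliberately naive first-element pivot are assumptions made for illustration; they are not taken from the question's code.

```java
class SmallerFirstQuickSort {
    // Deepest recursion level seen during the last sort (illustration only).
    static int maxDepth = 0;

    static void sort(int[] a) {
        maxDepth = 0;
        sort(a, 0, a.length, 1);
    }

    // Sorts a[first .. first + count - 1]; depth is the current recursion level.
    private static void sort(int[] a, int first, int count, int depth) {
        maxDepth = Math.max(maxDepth, depth);
        while (count > 1) {
            int p = partition(a, first, count);
            int leftSize = p - first;
            int rightSize = count - leftSize - 1;
            if (leftSize < rightSize) {
                sort(a, first, leftSize, depth + 1); // recurse on the smaller side
                first = p + 1;                       // keep looping on the larger side
                count = rightSize;
            } else {
                sort(a, p + 1, rightSize, depth + 1);
                count = leftSize;
            }
        }
    }

    // Partition with the first element as pivot: the worst choice for sorted input.
    private static int partition(int[] a, int first, int count) {
        int pivot = a[first];
        int boundary = first; // last index known to hold a value < pivot
        for (int i = first + 1; i < first + count; i++) {
            if (a[i] < pivot) {
                boundary++;
                int t = a[i]; a[i] = a[boundary]; a[boundary] = t;
            }
        }
        int t = a[first]; a[first] = a[boundary]; a[boundary] = t;
        return boundary;
    }

    public static void main(String[] args) {
        int n = 5000;
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) sorted[i] = i;
        sort(sorted);
        System.out.println("max recursion depth: " + maxDepth);
    }
}
```

Even with this worst-case pivot on sorted input, the recursion stays shallow, because every recursive call receives at most half of the current segment. The running time is still quadratic with such a pivot; the loop only bounds the stack depth, not the amount of work.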

Answer 2

In your partition method you sometimes use an element outside the range:

String string1 = data.get(firstIndex);
String string2 = data.get((firstIndex + (numberToPartition - 1)) / 2);
String string3 = data.get(firstIndex + numberToPartition - 1);

(firstIndex + (numberToPartition - 1)) / 2 is not the index of the middle element. That would be (firstIndex + (firstIndex + (numberToPartition - 1))) / 2 = firstIndex + ((numberToPartition - 1) / 2).
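A quick worked example of the difference between the two formulas, as a small sketch. The class name MidpointDemo, the method names, and the sample segment (indices 600 to 999) are made up for illustration.

```java
class MidpointDemo {
    // Formula from the question: halves the absolute index of the last element.
    static int questionMid(int firstIndex, int numberToPartition) {
        return (firstIndex + (numberToPartition - 1)) / 2;
    }

    // Midpoint of the segment: firstIndex plus half the segment length.
    static int segmentMid(int firstIndex, int numberToPartition) {
        return firstIndex + (numberToPartition - 1) / 2;
    }

    public static void main(String[] args) {
        int firstIndex = 600, numberToPartition = 400; // segment covers indices 600..999
        System.out.println(questionMid(firstIndex, numberToPartition)); // 499, left of the segment
        System.out.println(segmentMid(firstIndex, numberToPartition));  // 799, inside the segment
    }
}
```

The two formulas agree only when firstIndex is 0, which is probably why the problem is easy to miss on the first call.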

In fact, if firstIndex > n/2 (where n is the number of elements in the input), you're using an element with an index smaller than firstIndex. For sorted arrays that means you choose the element at firstIndex as the pivot element. Therefore you get a recursion depth in Ω(n), which causes the stack overflow for large enough inputs.
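One possible way to pick the median-of-three pivot with the corrected middle index is sketched below. The class MedianOfThree and the helper medianIndex are hypothetical names, not part of the question's code. The helper returns the index of the median candidate, so the caller can swap that index with firstIndex directly instead of comparing String references.

```java
import java.util.Arrays;
import java.util.List;

class MedianOfThree {
    // Returns the index (first, middle, or last of the segment) that holds the median value.
    static int medianIndex(List<String> data, int firstIndex, int count) {
        int lo = firstIndex;
        int mid = firstIndex + (count - 1) / 2; // always inside the segment
        int hi = firstIndex + count - 1;
        String a = data.get(lo), b = data.get(mid), c = data.get(hi);
        if (a.compareTo(b) <= 0) {
            if (b.compareTo(c) <= 0) return mid;     // a <= b <= c
            return (a.compareTo(c) <= 0) ? hi : lo;  // a <= c < b, or c < a <= b
        } else {
            if (a.compareTo(c) <= 0) return lo;      // b < a <= c
            return (b.compareTo(c) <= 0) ? hi : mid; // b <= c < a, or c < b < a
        }
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        // Segment of 4 elements starting at index 4 ("e".."h"): the median candidate is index 5.
        System.out.println(medianIndex(sorted, 4, 4)); // prints 5
    }
}
```

On a sorted segment this selects the middle element, so the partition splits roughly in half instead of peeling off one element per call.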
