
Can quicksort be implemented in C without stack and recursion?

I found this post, How to do iterative quicksort without using stack in c?, but the suggested answer does use an inline stack array! (Only a constant amount of extra space is permitted.)

The code on the referenced page makes a bold claim:

STACK My implementation does not use the stack to store data...

Yet the function definition has many variables with automatic storage, among them 2 arrays with 1000 entries, which will end up using a fixed but substantial amount of stack space:

//  quickSort
//
//  This public-domain C implementation by Darel Rex Finley.
//
//  * Returns YES if sort was successful, or NO if the nested
//    pivots went too deep, in which case your array will have
//    been re-ordered, but probably not sorted correctly.
//
//  * This function assumes it is called with valid parameters.
//
//  * Example calls:
//    quickSort(&myArray[0],5); // sorts elements 0, 1, 2, 3, and 4
//    quickSort(&myArray[3],5); // sorts elements 3, 4, 5, 6, and 7

bool quickSort(int *arr, int elements) {

  #define  MAX_LEVELS  1000

  int  piv, beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R ;

  beg[0]=0; end[0]=elements;
  while (i>=0) {
    L=beg[i]; R=end[i]-1;
    if (L<R) {
      piv=arr[L]; if (i==MAX_LEVELS-1) return NO;
      while (L<R) {
        while (arr[R]>=piv && L<R) R--; if (L<R) arr[L++]=arr[R];
        while (arr[L]<=piv && L<R) L++; if (L<R) arr[R--]=arr[L]; }
      arr[L]=piv; beg[i+1]=L+1; end[i+1]=end[i]; end[i++]=L; }
    else {
      i--; }}
  return YES; }

The indentation style is very confusing. Here is a reformatted version:

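#include <stdbool.h>

#define MAX_LEVELS  1000
#define YES  true   /* the original code assumes YES and NO are defined elsewhere */
#define NO   false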

bool quickSort(int *arr, int elements) {
    int piv, beg[MAX_LEVELS], end[MAX_LEVELS], i = 0, L, R;

    beg[0] = 0;
    end[0] = elements;
    while (i >= 0) {
        L = beg[i];
        R = end[i] - 1;
        if (L < R) {
            piv = arr[L];
            if (i == MAX_LEVELS - 1)
                return NO;
            while (L < R) {
                while (arr[R] >= piv && L < R)
                    R--;
                if (L < R)
                    arr[L++] = arr[R];
                while (arr[L] <= piv && L < R)
                    L++;
                if (L < R)
                    arr[R--] = arr[L];
            }
            arr[L] = piv;
            beg[i + 1] = L + 1;
            end[i + 1] = end[i];
            end[i++] = L;
        } else {
            i--;
        }
    }
    return YES;
}

Note that 1000 levels is large but still not sufficient for pathological cases. On an array that is already sorted, every partitioning step consumes one more level, so the function returns NO as soon as such an array has little more than 1000 elements, which is unacceptable.

A much lower value would suffice with an improved version of the algorithm where the larger range is pushed onto the beg/end arrays and the loop iterates on the smaller range. Each level then holds at most half as many elements as the level below it, so arrays of N entries can handle an input of 2^N elements. The function still has quadratic time complexity on sorted arrays, but at least it would sort arrays of all possible sizes.

Here is a modified and instrumented version:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX_LEVELS  64

int quickSort(int *arr, size_t elements) {
    size_t beg[MAX_LEVELS], end[MAX_LEVELS], L, R;
    int i = 0;

    beg[0] = 0;
    end[0] = elements;
    while (i >= 0) {
        L = beg[i];
        R = end[i];
        if (L + 1 < R--) {
            int piv = arr[L];
            if (i == MAX_LEVELS - 1)
                return -1;
            while (L < R) {
                while (arr[R] >= piv && L < R)
                    R--;
                if (L < R)
                    arr[L++] = arr[R];
                while (arr[L] <= piv && L < R)
                    L++;
                if (L < R)
                    arr[R--] = arr[L];
            }
            arr[L] = piv;
            if (L - beg[i] > end[i] - R) { 
                beg[i + 1] = L + 1;
                end[i + 1] = end[i];
                end[i++] = L;
            } else {
                beg[i + 1] = beg[i];
                end[i + 1] = L;
                beg[i++] = L + 1;
            }
        } else {
            i--;
        }
    }
    return 0;
}

int testsort(int *a, size_t size, const char *desc) {
    clock_t t = clock();
    size_t i;

    if (quickSort(a, size)) {
        printf("%s: quickSort failure\n", desc);
        return 1;
    }
    for (i = 1; i < size; i++) {
        if (a[i - 1] > a[i]) {
            printf("%s: sorting error: a[%zu]=%d > a[%zu]=%d\n",
                   desc, i - 1, a[i - 1], i, a[i]);
            return 2;
        }
    }
    t = clock() - t;
    printf("%s: %zu elements sorted in %.3fms\n",
           desc, size, t * 1000.0 / CLOCKS_PER_SEC);
    return 0;
}

int main(int argc, char *argv[]) {
    size_t i, size = argc > 1 ? strtoull(argv[1], NULL, 0) : 1000;
    int *a = malloc(sizeof(*a) * size);
    if (a != NULL) {
        for (i = 0; i < size; i++)
            a[i] = rand();
        testsort(a, size, "random");
        for (i = 0; i < size; i++)
            a[i] = i;
        testsort(a, size, "sorted");
        for (i = 0; i < size; i++)
            a[i] = size - i;
        testsort(a, size, "reverse sorted");
        for (i = 0; i < size; i++)
            a[i] = 0;
        testsort(a, size, "constant");
        free(a);
    }
    return 0;
}

Output:

random: 100000 elements sorted in 7.379ms
sorted: 100000 elements sorted in 2799.752ms
reverse sorted: 100000 elements sorted in 2768.844ms
constant: 100000 elements sorted in 2786.612ms
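
The depth bound claimed above (N levels are enough for 2^N elements when the smaller range is always processed first) can be checked empirically. The following is a minimal sketch, not the code used for the timings: quickSortDepth and DEPTH_LEVELS are hypothetical names, and it assumes a plain Lomuto partition around the first element, which keeps the quadratic behaviour on sorted input. It returns the deepest index of the beg/end arrays that was actually used.

```c
#include <assert.h>
#include <stddef.h>

#define DEPTH_LEVELS 64   /* hypothetical capacity, chosen for this sketch */

/* Sorts arr[0..n) and returns the deepest level used, or -1 on overflow. */
int quickSortDepth(int *arr, size_t n) {
    size_t beg[DEPTH_LEVELS], end[DEPTH_LEVELS];
    int i = 0, deepest = 0;

    beg[0] = 0;
    end[0] = n;
    while (i >= 0) {
        size_t lo = beg[i], hi = end[i];   /* half-open range [lo, hi) */
        if (hi - lo < 2) {                 /* 0 or 1 element: pop this level */
            i--;
            continue;
        }
        if (i == DEPTH_LEVELS - 1)
            return -1;
        /* Lomuto partition around the first element */
        int piv = arr[lo];
        size_t p = lo;
        for (size_t k = lo + 1; k < hi; k++) {
            if (arr[k] < piv) {
                int t;
                p++;
                t = arr[k]; arr[k] = arr[p]; arr[p] = t;
            }
        }
        arr[lo] = arr[p];
        arr[p] = piv;                      /* pivot is now in its final slot */
        assert(lo <= p && p < hi);
        if (p - lo > hi - (p + 1)) {
            end[i] = p;                    /* larger left part stays at level i */
            beg[i + 1] = p + 1;            /* smaller right part goes on top */
            end[i + 1] = hi;
        } else {
            beg[i] = p + 1;                /* larger right part stays at level i */
            beg[i + 1] = lo;               /* smaller left part goes on top */
            end[i + 1] = p;
        }
        i++;
        if (i > deepest)
            deepest = i;
    }
    return deepest;
}
```

By the halving argument, the value returned for an array of 1000 elements should not exceed 9, whatever the initial order, which is why a few dozen levels are enough in practice.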

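Here is a slightly modified version, more resistant to pathological cases: it takes the pivot from the middle of the range and leaves runs of elements equal to the pivot out of both subranges: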

#define MAX_LEVELS  48

int quickSort(int *arr, size_t elements) {
    size_t beg[MAX_LEVELS], end[MAX_LEVELS], L, R;
    int i = 0;

    beg[0] = 0;
    end[0] = elements;
    while (i >= 0) {
        L = beg[i];
        R = end[i];
        if (R - L > 1) {
            size_t M = L + ((R - L) >> 1);
            int piv = arr[M];
            arr[M] = arr[L];

            if (i == MAX_LEVELS - 1)
                return -1;
            R--;
            while (L < R) {
                while (arr[R] >= piv && L < R)
                    R--;
                if (L < R)
                    arr[L++] = arr[R];
                while (arr[L] <= piv && L < R)
                    L++;
                if (L < R)
                    arr[R--] = arr[L];
            }
            arr[L] = piv;
            M = L + 1;
            while (L > beg[i] && arr[L - 1] == piv)
                L--;
            while (M < end[i] && arr[M] == piv)
                M++;
            if (L - beg[i] > end[i] - M) {
                beg[i + 1] = M;
                end[i + 1] = end[i];
                end[i++] = L;
            } else {
                beg[i + 1] = beg[i];
                end[i + 1] = L;
                beg[i++] = M;
            }
        } else {
            i--;
        }
    }
    return 0;
}

Output:

random: 10000000 elements sorted in 963.973ms
sorted: 10000000 elements sorted in 167.621ms
reverse sorted: 10000000 elements sorted in 167.375ms
constant: 10000000 elements sorted in 9.335ms

In conclusion:

  • yes, quicksort can be implemented without recursion;
  • no, it cannot be implemented without any local automatic storage;
  • yes, only a constant amount of extra space is necessary, but only because we live in a small world where the maximum size of the array is bounded by available memory. With 64 entries in each local array, the function handles arrays larger than the size of the Internet, much larger than current 64-bit systems could address.

Apparently, it is possible to implement a non-recursive quicksort with only a constant amount of extra space, as stated here. This builds upon Sedgewick's work on a non-recursive formulation of quicksort. Instead of preserving the boundary values (low and high), it essentially performs a linear scan to determine these bounds.
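
To make the "linear scan instead of stored bounds" idea concrete, here is a minimal sketch. It is written in the spirit of that idea and is not the algorithm from the linked article; quicksort_o1 and max_to_front are hypothetical names. The assumption is that the unsorted part of the array is kept as a sequence of blocks, each with its largest element in front and each with a strictly larger maximum than the block before it. The end of the leftmost block can then be found by scanning for the first element greater than the block's first element, so no bounds need to be stored.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Move the largest element of a[lo..hi) to a[lo]; requires lo < hi. */
static void max_to_front(int *a, size_t lo, size_t hi) {
    size_t m = lo;
    assert(lo < hi);
    for (size_t k = lo + 1; k < hi; k++)
        if (a[k] > a[m])
            m = k;
    swap_int(&a[lo], &a[m]);
}

/* Sorts a[0..n) using only a fixed number of scalar variables. */
void quicksort_o1(int *a, size_t n) {
    size_t lo = 0;

    if (n < 2)
        return;
    max_to_front(a, 0, n);               /* the whole array is the first block */
    while (lo < n) {
        /* recover the end of the leftmost block with a linear scan */
        size_t hi = lo + 1;
        while (hi < n && a[hi] <= a[lo])
            hi++;
        /* pick a pivot strictly smaller than the block maximum a[lo] */
        size_t p = lo + (hi - lo) / 2;
        if (a[p] == a[lo]) {
            for (p = lo + 1; p < hi && a[p] == a[lo]; p++)
                continue;
        }
        if (p == hi) {                   /* all elements equal: block is final */
            lo = hi;
            continue;
        }
        int piv = a[p];
        size_t q = lo;
        for (size_t k = lo; k < hi; k++) /* a[lo..q) <= piv < a[q..hi) */
            if (a[k] <= piv)
                swap_int(&a[k], &a[q++]);
        max_to_front(a, lo, q);          /* re-establish the block markers */
        max_to_front(a, q, hi);
    }
}
```

The price for dropping the stack is extra passes over each block (one to find its end, one per sub-block to move the maximum to the front), so this trades time for space rather than getting the space saving for free.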

Well, it can, because I implemented a quicksort in Fortran IV (it was a long time ago, before the language supported recursion, and it was for a bet). However, you do need somewhere (a large array would do) to remember your state as you do individual bits of work.

It's a lot easier recursively...
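
For comparison, a recursive formulation might look like the sketch below. The names quicksort_rec and partition_mid are hypothetical, and the middle-element Lomuto partition is just one possible choice. The pending bounds are not gone: they live in the call-stack frames. Recursing only into the smaller part and looping on the larger one keeps the recursion depth logarithmic in the array size.

```c
#include <assert.h>
#include <stddef.h>

/* Partition a[lo..hi) around its middle element; returns the pivot's final index. */
static size_t partition_mid(int *a, size_t lo, size_t hi) {
    size_t mid = lo + (hi - lo) / 2, p = lo;
    int piv = a[mid];

    assert(lo < hi);
    a[mid] = a[lo];                      /* park the pivot at the front */
    a[lo] = piv;
    for (size_t k = lo + 1; k < hi; k++) {
        if (a[k] < piv) {
            int t;
            p++;
            t = a[k]; a[k] = a[p]; a[p] = t;
        }
    }
    a[lo] = a[p];                        /* put the pivot between the two parts */
    a[p] = piv;
    return p;
}

/* Sorts the half-open range a[lo..hi). */
void quicksort_rec(int *a, size_t lo, size_t hi) {
    while (hi - lo > 1) {
        size_t p = partition_mid(a, lo, hi);
        if (p - lo < hi - (p + 1)) {
            quicksort_rec(a, lo, p);     /* smaller left part: recurse */
            lo = p + 1;                  /* larger right part: keep looping */
        } else {
            quicksort_rec(a, p + 1, hi); /* smaller right part: recurse */
            hi = p;                      /* larger left part: keep looping */
        }
    }
}
```

A whole array of n elements would be sorted with quicksort_rec(arr, 0, n).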

Quicksort is by definition a "divide and conquer" sorting algorithm: the idea is that you split the given array into smaller partitions. So you are dividing the problem into subproblems that are easier to solve. When using quicksort without recursion, you need a data structure of some sort to store the partitions you are not working on at the time. That's why the answer in that post uses an array to make quicksort non-recursive.

Can quicksort be implemented in C without stack and recursion?

Quicksort requires two paths be followed forward from each non-trivial partitioning: a new partitioning of each (sub)partition. Information about the previous partitioning (the bounds of one of the resulting partitions) needs to be carried forward to each new partitioning. The question, then, is where does that information live? In particular, where does the information about one partition live while the program is working on the other?

For a serial algorithm, the answer is that the information is stored on a stack or a queue or a functional equivalent of one of those. Always, because those are our names for data structures that serve the needed purpose. In particular, recursion is a special case, not an alternative. In a recursive quicksort, the data are stored on the call stack. For an iterative implementation you can implement a stack in a formal sense, but it's possible to instead use a simple and relatively small array as a makeshift stack.

But stack and queue equivalents can go a lot farther than that. You could append data to a file, for example, for later read-back. You could write it to a pipe. You could transmit it to yourself asynchronously over a communications network.

If you wanted to go crazy, you could even nest iterations in place of recursing. That would impose a hard upper bound on the size of the arrays that could be handled, but not as tight of one as you might think. With some care and a few tricks, you could handle billion-element arrays with a 25-loop nest. Such a deep nest would be ugly and crazy, but nevertheless conceivable. A human could write it by hand. And in that case, the series of nested loop scopes, with their block-scoped variables, serves as a stack equivalent.

So the answer depends on what exactly you mean by "without stack":

  • yes, you can use a queue instead, though it would need to have about the same capacity as there are elements to sort;
  • yes, you can use an array or some other kind of sequential data storage to emulate a formal stack or queue;
  • yes, you can encode a suitable stack equivalent directly into the structure of your program;
  • yes, you can probably come up with other, more esoteric versions of stacks and queues;
  • but no, you cannot perform a quicksort without something filling the multi-level data-storage role for which a stack or stack-equivalent is conventionally used.
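
The queue variant from the first bullet can be sketched as follows. This is an illustration under stated assumptions, not a recommended implementation: quicksort_queue and struct range are hypothetical names, the queue is a heap-allocated ring buffer, and only ranges with at least two elements are enqueued. Because the pending ranges are disjoint, at most n / 2 of them exist at any time, which is still proportional to the number of elements, as the bullet says.

```c
#include <assert.h>
#include <stdlib.h>

struct range { size_t beg, end; };       /* half-open range [beg, end) */

/* Sorts a[0..n) breadth-first; returns 0 on success, -1 if malloc fails. */
int quicksort_queue(int *a, size_t n) {
    size_t cap = n / 2 + 1, head = 0, count = 0;
    struct range *q;

    if (n < 2)
        return 0;
    q = malloc(cap * sizeof *q);
    if (q == NULL)
        return -1;
    q[0].beg = 0;
    q[0].end = n;
    count = 1;
    while (count > 0) {
        struct range r = q[head];        /* dequeue the oldest pending range */
        size_t mid = r.beg + (r.end - r.beg) / 2, p = r.beg, tail;
        int piv = a[mid];

        head = (head + 1) % cap;
        count--;
        a[mid] = a[r.beg];               /* Lomuto partition, middle pivot */
        a[r.beg] = piv;
        for (size_t k = r.beg + 1; k < r.end; k++) {
            if (a[k] < piv) {
                int t;
                p++;
                t = a[k]; a[k] = a[p]; a[p] = t;
            }
        }
        a[r.beg] = a[p];
        a[p] = piv;
        if (p - r.beg > 1) {             /* enqueue the left part if non-trivial */
            assert(count < cap);
            tail = (head + count++) % cap;
            q[tail].beg = r.beg;
            q[tail].end = p;
        }
        if (r.end - (p + 1) > 1) {       /* enqueue the right part if non-trivial */
            assert(count < cap);
            tail = (head + count++) % cap;
            q[tail].beg = p + 1;
            q[tail].end = r.end;
        }
    }
    free(q);
    return 0;
}
```

Unlike the stack versions, the queue cannot be bounded by processing the smaller part first, because ranges are handled in arrival order; that is the reason its capacity grows with the array instead of with its logarithm.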
