
Why does radix sort have a space complexity of O(k + n)?

Consider an array of n numbers, each with at most k digits (see Edit). Consider the radix sort program from here :

def radixsort( aList ):
  RADIX = 10
  maxLength = False
  tmp, placement = -1, 1

  while not maxLength:
    maxLength = True
    # declare and initialize buckets
    buckets = [list() for _ in range( RADIX )]

    # split aList between lists
    for i in aList:
      tmp = i // placement  # integer division; a plain / yields a float in Python 3
      buckets[tmp % RADIX].append( i )
      if maxLength and tmp > 0:
        maxLength = False

    # empty lists into aList array
    a = 0
    for b in range( RADIX ):
      buck = buckets[b]
      for i in buck:
        aList[a] = i
        a += 1

    # move to next digit
    placement *= RADIX

buckets is essentially a 2D list holding all the numbers, but only n values are ever added to it. Why is the space complexity O(k + n) and not O(n)? Correct me if I am wrong, but even counting the space used to extract the digit at a particular place, that is only a constant amount of memory.

Edit: To explain my understanding of k, suppose the input is [12, 13, 65, 32, 789, 1, 3]. The algorithm given in the link goes through 4 passes of the while loop inside the function. Here k = 4, i.e. the maximum number of digits of any element in the array, plus 1. Thus k is the number of passes. This is the same k that appears in the time complexity of this algorithm, O(kn), which makes sense. What I cannot understand is how it plays a role in the space complexity, O(k + n).
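The pass count described in the edit can be checked with a small sketch. The helper name count_passes is hypothetical (it is not part of the linked program); it only mirrors the loop's stopping rule, which ends after the first pass in which every quotient is 0.

```python
def count_passes(values, radix=10):
    """Count how many times the question's while loop would run.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    passes = 0
    placement = 1
    while True:
        passes += 1
        # The original loop stops after a pass in which every quotient is 0.
        if all(v // placement == 0 for v in values):
            break
        placement *= radix
    return passes


print(count_passes([12, 13, 65, 32, 789, 1, 3]))  # 4
```

The longest number, 789, has 3 digits, and one extra pass is needed to detect that no digits remain, which gives the 4 passes mentioned above.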

Radix sort's space complexity is determined by the sort it uses for each digit. In the best case, that is counting sort.

Here is the pseudocode provided by CLRS for counting sort:

Counting-sort(A,B,k)
  let C[0..k] be a new array
  for i = 0 to k
      C[i] = 0
  for j = 1 to A.length
      C[A[j]] = C[A[j]] + 1
  for i = 1 to k
      C[i] = C[i] + C[i-1]
  for j = A.length down to 1
      B[C[A[j]]] = A[j]
      C[A[j]] = C[A[j]] - 1 

As you can see, counting sort creates two arrays: one whose size depends on k and one whose size depends on n. B is the output array, of size n. C is an auxiliary array of size k + 1, i.e. O(k).
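As a rough Python translation of the pseudocode above (a sketch, not CLRS's own code): it uses 0-based indexing, and it allocates the output array b inside the function instead of taking it as a parameter. Those two choices are assumptions made here for convenience.

```python
def counting_sort(a, k):
    """Stable counting sort of integers in the range 0..k (inclusive)."""
    c = [0] * (k + 1)   # auxiliary count array: k + 1 slots, i.e. O(k)
    b = [0] * len(a)    # output array: n slots, i.e. O(n)

    # Count how many times each key occurs.
    for x in a:
        c[x] += 1

    # Prefix sums: c[i] becomes the number of keys <= i.
    for i in range(1, k + 1):
        c[i] += c[i - 1]

    # Walk the input backwards so that equal keys keep their order.
    for x in reversed(a):
        c[x] -= 1       # decrement first because Python lists are 0-based
        b[c[x]] = x
    return b


print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))
# [0, 0, 2, 2, 3, 3, 3, 5]
```

The only allocations are c and b, which is where the O(n + k) bound comes from.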

Because radix sort uses counting sort, counting sort's space complexity is a lower bound on radix sort's space complexity.

I think there is a terminological issue. The space complexity of the question's implementation, and of the implementation mentioned in Jayson Boubin's answer, is O(n+k). But k is not the length of the longest word (or longest number). k is the size of the 'alphabet': the number of different digits (in numbers) or letters (in words).

buckets = [list() for _ in range( RADIX )]

This code creates an array with RADIX elements. In this particular implementation RADIX is a constant (so the space complexity is O(n)), but in general it is a variable. RADIX plays the role of k, the number of different digits (or letters in the alphabet). This k does not depend on n and can be larger than n in some cases, so the space complexity is O(n+k) in general.
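To make the role of k concrete, here is a minimal sketch of the same bucket-based approach with the radix exposed as a parameter. The function name radix_sort, its signature, and the stopping rule based on the largest value are assumptions made for this illustration; they differ from the question's code.

```python
def radix_sort(values, radix=10):
    """LSD radix sort for non-negative integers; returns a new sorted list."""
    if radix < 2:
        raise ValueError("radix must be at least 2")
    result = list(values)
    largest = max(result, default=0)
    placement = 1
    while placement <= largest:
        # k = radix empty lists are created on every pass ...
        buckets = [[] for _ in range(radix)]
        # ... and together they hold exactly n values.
        for v in result:
            buckets[(v // placement) % radix].append(v)
        result = [v for bucket in buckets for v in bucket]
        placement *= radix
    return result


data = [12, 13, 65, 32, 789, 1, 3]
print(radix_sort(data))             # [1, 3, 12, 13, 32, 65, 789]
print(radix_sort(data, radix=256))  # same result with 256 buckets per pass
```

With radix=256 and only 7 inputs, the bucket list itself is larger than the data, which is the case where the k term dominates.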

Edit: In this implementation the size of placement (or tmp) is O(k) (with your definition of k), because k is log(maxNumber) base 10 and the size of placement in bytes is log(maxNumber) base 256. But I'm not sure this holds in general.

Radix sort applies counting sort to each digit of the numbers in the dataset. Counting sort has a space complexity of O(n+k), where k is the largest number in the dataset.

Decimal digits range from 0 to 9, so if we sort 4 decimal numbers (11, 22, 88, 99) using radix sort (with counting sort used inside it), then for each digit it creates a count array of size b = 10, where b is the base.

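It means that the total space allocated over all passes would be (number of digits) * (n + base). If the number of digits is constant, the space complexity becomes O(n + base). The arrays can also be reused from one pass to the next, which gives the same bound.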

Hence the space complexity of Radix Sort is O(n+b).
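A minimal sketch of radix sort built on counting sort, assuming non-negative integers. The name radix_sort_counting and the choice to allocate the two arrays once and reuse them on every pass are assumptions of this illustration, not something stated above.

```python
def radix_sort_counting(values, base=10):
    """LSD radix sort using a counting sort per digit; returns a new list."""
    a = list(values)
    out = [0] * len(a)    # output array of size n, reused on every pass
    count = [0] * base    # count array of size b, reused on every pass
    largest = max(a, default=0)
    place = 1
    while place <= largest:
        # Reset the counts for this digit position.
        for i in range(base):
            count[i] = 0
        for x in a:
            count[(x // place) % base] += 1
        # Prefix sums turn counts into end positions.
        for i in range(1, base):
            count[i] += count[i - 1]
        # Fill the output backwards to keep the pass stable.
        for x in reversed(a):
            d = (x // place) % base
            count[d] -= 1
            out[count[d]] = x
        a, out = out, a   # swap roles instead of allocating new arrays
        place *= base
    return a


print(radix_sort_counting([88, 11, 99, 22]))  # [11, 22, 88, 99]
```

However many digits the numbers have, the auxiliary storage here stays at one array of size n and one of size b.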
