Is this a correctly implemented mergesort in Haskell?

Question

I could not find my code anywhere on the net, so can you please tell me whether the function myMergeSort is a mergesort, and why or why not? I know that myMergeSort sorts, but I am not sure whether it really uses the mergesort algorithm or a different one. I started learning Haskell a few days ago.

merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

myMergeSort :: [Int] -> [Int]
myMergeSort [] = []
myMergeSort (x:[]) = [x]
myMergeSort (x:xs) = foldl merge [] (map (\x -> [x]) (x:xs))

I have no questions about the merge function.

The following function mergeSortOfficial was the solution presented to us. I understand it, but I am not sure whether my function myMergeSort implements the mergesort algorithm correctly.

Official solution - implementation:

mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs = merge
    (mergeSortOfficial (take ((length xs) `div` 2) xs))
    (mergeSortOfficial (drop ((length xs) `div` 2) xs))

Answer 1

No, that's not mergeSort. That's insertionSort, which is essentially the same algorithm as bubbleSort, depending on how you stare at it. At each step, a singleton list is merged with the accumulated ordered-list-so-far, so, effectively, the element of that singleton is inserted.
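To make the "merging a singleton is inserting" point concrete, here is a minimal sketch that places the question's fold next to a fold over Data.List.insert. The name insertionSort is hypothetical and only there for comparison; merge is copied from the question, with a type signature added.

```haskell
import Data.List (insert)

-- merge as defined in the question (signature added)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

-- The question's definition; the special cases for [] and [x] are
-- already covered by the fold, so they are left out here.
myMergeSort :: [Int] -> [Int]
myMergeSort = foldl merge [] . map (\x -> [x])

-- Insertion sort written with the library's insert (hypothetical name).
insertionSort :: [Int] -> [Int]
insertionSort = foldl (flip insert) []
```

Both functions walk the accumulated sorted list once per input element, which is where the quadratic behaviour comes from.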

As other commenters have already observed, to get mergeSort (and in particular, its efficiency), it's necessary to divide the problem repeatedly into roughly equal parts (rather than "one element" and "the rest"). The "official" solution gives a rather clunky way to do that. I quite like

foldr (\ x (ys, zs) -> (x : zs, ys)) ([], [])

as a way to split a list in two, not in the middle, but into elements in even and odd positions.
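As a small illustration of that fold, the sketch below gives it a name (alternate is a hypothetical name, not something from the answer) so it can be tried out in GHCi.

```haskell
-- The fold quoted above, under a hypothetical name.
-- Each step conses the new element onto the second list and swaps the
-- two lists, so elements alternate between the two results.
alternate :: [a] -> ([a], [a])
alternate = foldr (\x (ys, zs) -> (x : zs, ys)) ([], [])
```

For example, alternate [1, 2, 3, 4, 5] evaluates to ([1, 3, 5], [2, 4]): the two halves differ in length by at most one, which is what mergesort needs.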

If, like me, you like to have structure up front where you can see it, you can make ordered lists a Monoid.

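{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import Data.Monoid
import Data.Foldable
import Control.Newtype  -- from the newtype package

newtype Merge x = Merge {merged :: [x]}
instance Newtype (Merge x) [x] where
  pack = Merge
  unpack = merged

-- Since GHC 8.4, Monoid requires a Semigroup instance, which carries the merge.
instance Ord x => Semigroup (Merge x) where
  Merge xs <> Merge ys = Merge (merge xs ys)  -- merge is as you defined it

instance Ord x => Monoid (Merge x) where
  mempty = Merge []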

And now you have insertion sort just by

ala' Merge foldMap (:[]) :: Ord x => [x] -> [x]

One way to get the divide-and-conquer structure of mergeSort is to make it a data structure: binary trees.

data Tree x = None | One x | Node (Tree x) (Tree x) deriving Foldable

I haven't enforced a balancing invariant here, but I could. The point is that the same operation as before has another type

ala' Merge foldMap (:[]) :: Ord x => Tree x -> [x]

which merges lists collected from a treelike arrangement of elements. To obtain said arrangements, think "what's cons for Tree?" and make sure you keep your balance, by the same kind of twistiness I used in the above "dividing" operation.

twistin :: x -> Tree x -> Tree x   -- a very cons-like type
twistin x None        = One x
twistin x (One y)     = Node (One x) (One y)
twistin x (Node l r)  = Node (twistin x r) l

Now you have mergeSort by building a binary tree, then merging it.

mergeSort :: Ord x => [x] -> [x]
mergeSort = ala' Merge foldMap (:[]) . foldr twistin None
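If the newtype package is not at hand, the same tree-based idea can be sketched with base only. This is an illustration under assumptions, not the answer's code: the names mergeTree, treeMergeSort and depth are hypothetical, and explicit recursion stands in for ala' Merge foldMap (:[]).

```haskell
data Tree x = None | One x | Node (Tree x) (Tree x)

-- merge as defined in the question (signature added)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

-- twistin from the answer: insert into the right subtree, then swap.
twistin :: x -> Tree x -> Tree x
twistin x None       = One x
twistin x (One y)    = Node (One x) (One y)
twistin x (Node l r) = Node (twistin x r) l

-- Hypothetical helper: merge the sorted lists of the two subtrees.
mergeTree :: Ord x => Tree x -> [x]
mergeTree None       = []
mergeTree (One x)    = [x]
mergeTree (Node l r) = merge (mergeTree l) (mergeTree r)

-- Hypothetical name: build the tree, then merge it.
treeMergeSort :: Ord x => [x] -> [x]
treeMergeSort = mergeTree . foldr twistin None

-- Hypothetical helper for checking balance: number of Node levels.
depth :: Tree x -> Int
depth (Node l r) = 1 + max (depth l) (depth r)
depth _          = 0
```

Because twistin always grows the smaller side and then swaps, the two subtrees of every Node differ in size by at most one, so depth stays logarithmic in the number of elements.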

Of course, introducing the intermediate data structure has curiosity value, but you can easily cut it out and get something like

mergeSort :: Ord x => [x] -> [x]
mergeSort []   = []
mergeSort [x]  = [x]
mergeSort xs   = merge (mergeSort ys) (mergeSort zs) where
  (ys, zs) = foldr (\ x (ys, zs) -> (x : zs, ys)) ([], []) xs

where the tree has become the recursion structure of the program.

Answer 2

myMergeSort is not a correct merge sort. It is a correct insertion sort though. We start with an empty list, then insert the elements one-by-one into the correct position:

myMergeSort [2, 1, 4, 3] ==
foldl merge [] [[2], [1], [4], [3]] ==
((([] `merge` [2]) `merge` [1]) `merge` [4]) `merge` [3] ==
(([2] `merge` [1]) `merge` [4]) `merge` [3] ==
([1, 2] `merge` [4]) `merge` [3] ==
[1, 2, 4] `merge` [3] ==
[1, 2, 3, 4]

Since each insertion takes linear time, the whole sort is quadratic.

mergeSortOfficial is technically correct, but it's inefficient. length takes linear time, and each recursive call runs it twice over its entire input list. take and drop are also linear. The overall complexity remains the optimal n * log n, but we make several unnecessary passes over the list.

If we stick to top-down merging, we could do better by splitting the list into a list of the elements with even indices and another with the elements with odd indices. Splitting is still linear, but it's only a single traversal instead of two (length and then take/drop in the official sort).

split :: [a] -> ([a], [a])
split = go [] [] where
  go as bs []     = (as, bs)
  go as bs (x:xs) = go (x:bs) as xs

mergeSortOfficial :: [Int] -> [Int]
mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs = 
  let (as, bs) = split xs in
    merge (mergeSortOfficial as) (mergeSortOfficial bs)

As WillNess noted in the comments, the above split yields an unstable sort. We can use a stable alternative:

import Control.Arrow

stableSplit :: [a] -> ([a], [a])
stableSplit xs = go xs xs where
    go (x:xs) (_:_:ys) = first (x:) (go xs ys)
    go xs     ys       = ([], xs)
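The stability difference only shows up when equal keys carry distinguishable payloads, so here is a rough sketch that sorts pairs by their first component. Everything beyond split and stableSplit is an assumption made for the demonstration: mergeOn is the question's merge generalised to compare by a key, and sortOnWith and tagged are hypothetical names.

```haskell
import Control.Arrow (first)

-- The question's merge, generalised to compare by a key (hypothetical).
-- On ties it takes the element from the left list first.
mergeOn :: Ord k => (a -> k) -> [a] -> [a] -> [a]
mergeOn _ xs [] = xs
mergeOn _ [] ys = ys
mergeOn key (x : xs) (y : ys)
    | key x <= key y = x : mergeOn key xs (y : ys)
    | otherwise      = y : mergeOn key (x : xs) ys

-- The alternating split from above (variables renamed).
split :: [a] -> ([a], [a])
split = go [] [] where
  go ls rs []     = (ls, rs)
  go ls rs (x:xs) = go (x:rs) ls xs

-- The stable split from above (variables renamed).
stableSplit :: [a] -> ([a], [a])
stableSplit xs = go xs xs where
    go (y:ys) (_:_:zs) = first (y:) (go ys zs)
    go ys     _        = ([], ys)

-- Top-down merge sort parameterised over the splitter (hypothetical).
sortOnWith :: Ord k => ([a] -> ([a], [a])) -> (a -> k) -> [a] -> [a]
sortOnWith splitter key = go where
    go []  = []
    go [x] = [x]
    go xs  = let (ls, rs) = splitter xs
             in mergeOn key (go ls) (go rs)

-- Three elements with equal keys; the Char records the original order.
tagged :: [(Int, Char)]
tagged = [(1, 'a'), (1, 'b'), (1, 'c')]
```

With these definitions, map snd (sortOnWith stableSplit fst tagged) gives "abc", while map snd (sortOnWith split fst tagged) gives "acb": the alternating split reorders elements that compare equal.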

The best way is probably doing a bottom-up merge. It's the approach the sort in Data.List takes. Here we merge consecutive pairs of lists until there is only a single list left:

mergeSort :: Ord a => [a] -> [a]
mergeSort [] = []
mergeSort xs = mergeAll (map (:[]) xs) where
    mergePairs (x:y:ys) = merge x y : mergePairs ys
    mergePairs xs       = xs

    mergeAll [xs] = xs
    mergeAll xs   = mergeAll (mergePairs xs)

Data.List.sort works largely the same as above, except it starts with finding descending and ascending runs in the input instead of just creating singleton lists from the elements.
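A simplified sketch of that run-finding idea follows. It is not the actual Data.List source; the names runs, ascending, descending and naturalMergeSort are hypothetical, and the ascending case uses a plain reversed accumulator where the library does something more refined.

```haskell
-- merge as defined in the question (signature added)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

-- Cut the input into runs that are already sorted. Strictly descending
-- runs are collected in reverse, so every run comes out ascending.
runs :: Ord a => [a] -> [[a]]
runs (a : b : xs)
    | a > b     = descending b [a] xs
    | otherwise = ascending  b [a] xs
runs xs = [xs]

-- acc holds the run so far, most recent element first.
descending :: Ord a => a -> [a] -> [a] -> [[a]]
descending a acc (b : bs)
    | a > b = descending b (a : acc) bs
descending a acc bs = (a : acc) : runs bs

ascending :: Ord a => a -> [a] -> [a] -> [[a]]
ascending a acc (b : bs)
    | a <= b = ascending b (a : acc) bs
ascending a acc bs = reverse (a : acc) : runs bs

-- Bottom-up merging as above, but starting from runs, not singletons.
naturalMergeSort :: Ord a => [a] -> [a]
naturalMergeSort = mergeAll . runs where
    mergePairs (x : y : ys) = merge x y : mergePairs ys
    mergePairs xs           = xs

    mergeAll [xs] = xs
    mergeAll xs   = mergeAll (mergePairs xs)
```

For example, runs [3, 2, 1, 5] evaluates to [[1, 2, 3], [5]]. On input that is already sorted there is only one run, so no merging happens at all.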
