I could not find my code anywhere on the net, so can you please tell me whether the function myMergeSort is a mergesort, and why or why not? I know that myMergeSort sorts, but I am not sure whether it really uses the mergesort algorithm or a different one. I began learning Haskell a few days ago.
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys
myMergeSort :: [Int] -> [Int]
myMergeSort [] = []
myMergeSort (x:[]) = [x]
myMergeSort (x:xs) = foldl merge [] (map (\x -> [x]) (x:xs))
I have no questions about the merge function.
The following function, mergeSortOfficial, was the solution presented to us. I understand it, but I am not sure whether my function myMergeSort implements the mergesort algorithm correctly.
Official solution - implementation:
mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs = merge
    (mergeSortOfficial (take ((length xs) `div` 2) xs))
    (mergeSortOfficial (drop ((length xs) `div` 2) xs))
No, that's not mergeSort. That's insertionSort, which is essentially the same algorithm as bubbleSort, depending on how you stare at it. At each step, a singleton list is merged with the accumulated ordered-list-so-far, so, effectively, the element of that singleton is inserted.
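To make that correspondence concrete, here is a minimal sketch that puts the fold from the question next to an insertion sort written with `Data.List.insert`. The name `insertionSort` is introduced here for illustration only; `merge` and `myMergeSort` follow the question (the two special cases of `myMergeSort` are already covered by the fold, so they are omitted).

```haskell
import Data.List (insert)

merge :: [Int] -> [Int] -> [Int]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- The fold from the question: merge each singleton into the accumulator.
myMergeSort :: [Int] -> [Int]
myMergeSort = foldl merge [] . map (\x -> [x])

-- Hypothetical name: insertion sort written directly with Data.List.insert.
insertionSort :: [Int] -> [Int]
insertionSort = foldl (flip insert) []
```

Merging the accumulator with a singleton `[x]` walks the accumulator until it finds the place for `x`, which is the same linear scan that `insert x` performs, so the two functions do the same work per element.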
As other commenters have already observed, to get mergeSort (and in particular, its efficiency), it's necessary to divide the problem repeatedly into roughly equal parts (rather than "one element" and "the rest"). The "official" solution gives a rather clunky way to do that. I quite like
foldr (\ x (ys, zs) -> (x : zs, ys)) ([], [])
as a way to split a list in two, not in the middle, but into elements in even and odd positions.
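As a small illustration of what that fold produces, the sketch below gives it a name (`alternate` is a hypothetical name, not from the answer) and shows the result on a five-element list.

```haskell
-- Hypothetical name for the fold quoted above.
-- Each step conses the new element onto the second list and swaps the pair,
-- so elements alternate between the two results.
alternate :: [a] -> ([a], [a])
alternate = foldr (\x (ys, zs) -> (x : zs, ys)) ([], [])

-- alternate [1, 2, 3, 4, 5] == ([1, 3, 5], [2, 4])
```

The two halves differ in length by at most one, which is the "roughly equal parts" property that merge sort needs.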
If, like me, you like to have structure up front where you can see it, you can make ordered lists a Monoid.
import Data.Monoid
import Data.Foldable
import Control.Newtype
newtype Merge x = Merge {merged :: [x]}
instance Newtype (Merge x) [x] where
    pack = Merge
    unpack = merged
instance Ord x => Monoid (Merge x) where
    mempty = Merge []
    mappend (Merge xs) (Merge ys) = Merge (merge xs ys) where
        -- merge is as you defined it
And now you have insertion sort just by
ala' Merge foldMap (:[]) :: Ord x => [x] -> [x]
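For readers who do not have the `newtype` package at hand, the same idea can be sketched with only the Prelude by wrapping and unwrapping by hand. This is an approximation under stated assumptions, not the answer's exact code: `sortViaMonoid` is a hypothetical name, and the separate `Semigroup` instance is there because base 4.11 and later require one for every `Monoid`.

```haskell
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- Ordered lists, combined by merging.
newtype Merge a = Merge { merged :: [a] }

instance Ord a => Semigroup (Merge a) where
  Merge xs <> Merge ys = Merge (merge xs ys)

instance Ord a => Monoid (Merge a) where
  mempty = Merge []

-- Hypothetical name: wrap every element as a singleton, combine, unwrap.
sortViaMonoid :: (Foldable t, Ord a) => t a -> [a]
sortViaMonoid = merged . foldMap (Merge . (: []))
```

Because the function is written against `Foldable`, the shape of the container decides how the merges are nested: on a plain list each merge has a singleton on one side, which is the insertion-sort behaviour described above.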
One way to get the divide-and-conquer structure of mergeSort is to make it a data structure: binary trees.
data Tree x = None | One x | Node (Tree x) (Tree x) deriving Foldable
I haven't enforced a balancing invariant here, but I could. The point is that the same operation as before has another type
ala' Merge foldMap (:[]) :: Ord x => Tree x -> [x]
which merges lists collected from a treelike arrangement of elements. To obtain said arrangements, think "what's cons for Tree?" and make sure you keep your balance, by the same kind of twistiness I used in the above "dividing" operation.
twistin :: x -> Tree x -> Tree x -- a very cons-like type
twistin x None = One x
twistin x (One y) = Node (One x) (One y)
twistin x (Node l r) = Node (twistin x r) l
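To see that this twisting insertion does keep the tree balanced, the following sketch builds trees with `foldr twistin None` and measures them. The helpers `size`, `depth`, and `build` are hypothetical names added for inspection only; the `Tree` type is repeated without the derived `Foldable` instance so that the block needs no language extensions.

```haskell
data Tree x = None | One x | Node (Tree x) (Tree x)

twistin :: x -> Tree x -> Tree x
twistin x None       = One x
twistin x (One y)    = Node (One x) (One y)
twistin x (Node l r) = Node (twistin x r) l  -- insert on the right, then swap sides

-- Hypothetical helper: number of elements stored in the tree.
size :: Tree x -> Int
size None       = 0
size (One _)    = 1
size (Node l r) = size l + size r

-- Hypothetical helper: number of Node layers above the deepest leaf.
depth :: Tree x -> Int
depth (Node l r) = 1 + max (depth l) (depth r)
depth _          = 0

build :: [x] -> Tree x
build = foldr twistin None

-- depth (build [1 .. 8])    == 3
-- depth (build [1 .. 1000]) == 10
```

Each insertion goes into the smaller side and the sides are then swapped, so the two subtrees of any `Node` differ in size by at most one and the depth grows logarithmically with the number of elements.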
Now you have mergeSort by building a binary tree, then merging it.
mergeSort :: Ord x => [x] -> [x]
mergeSort = ala' Merge foldMap (:[]) . foldr twistin None
Of course, introducing the intermediate data structure has curiosity value, but you can easily cut it out and get something like
mergeSort :: Ord x => [x] -> [x]
mergeSort [] = []
mergeSort [x] = [x]
mergeSort xs = merge (mergeSort ys) (mergeSort zs) where
    (ys, zs) = foldr (\ x (ys, zs) -> (x : zs, ys)) ([], []) xs
where the tree has become the recursion structure of the program.
myMergeSort is not a correct merge sort. It is a correct insertion sort though. We start with an empty list, then insert the elements one-by-one into the correct position:
myMergeSort [2, 1, 4, 3] ==
foldl merge [] [[2], [1], [4], [3]] ==
((([] `merge` [2]) `merge` [1]) `merge` [4]) `merge` [3] ==
(([2] `merge` [1]) `merge` [4]) `merge` [3] ==
([1, 2] `merge` [4]) `merge` [3] ==
[1, 2, 4] `merge` [3] ==
[1, 2, 3, 4]
Since each insertion takes linear time, the whole sort is quadratic.
mergeSortOfficial is technically right, but it's inefficient. length takes linear time, and across each level of recursion the calls add up to the total length of the list. take and drop are also linear. The overall complexity remains the optimal n * log n, but we make a few unnecessary passes over the data.
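One small saving that keeps the halving scheme of the official solution is to replace the separate `take` and `drop` with the Prelude's `splitAt`, which produces both halves in one traversal. The sketch below is only an illustration of that point; `mergeSortHalves` is a hypothetical name, and `length` is still evaluated at every level.

```haskell
merge :: [Int] -> [Int] -> [Int]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- Hypothetical variant of the official solution.
mergeSortHalves :: [Int] -> [Int]
mergeSortHalves []  = []
mergeSortHalves [x] = [x]
mergeSortHalves xs  = merge (mergeSortHalves front) (mergeSortHalves back)
  where
    -- splitAt n xs gives the same pair as (take n xs, drop n xs) in one pass
    (front, back) = splitAt (length xs `div` 2) xs
```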
If we stick to top-down merging, we could do better by splitting the list into a list of elements with even indices and another with odd indices. Splitting is still linear, but it's only a single traversal instead of two (length and then take / drop in the official sort).
split :: [a] -> ([a], [a])
split = go [] [] where
    go as bs [] = (as, bs)
    go as bs (x:xs) = go (x:bs) as xs
mergeSortOfficial :: [Int] -> [Int]
mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs =
    let (as, bs) = split xs in
    merge (mergeSortOfficial as) (mergeSortOfficial bs)
As WillNess noted in the comments, the above split yields an unstable sort. We can use a stable alternative:
import Control.Arrow
stableSplit :: [a] -> ([a], [a])
stableSplit xs = go xs xs where
    go (x:xs) (_:_:ys) = first (x:) (go xs ys)
    go xs ys = ([], xs)
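To illustrate the difference between the two splitting functions, the sketch below repeats both (with local variables renamed to avoid shadowing) and records what each returns for the same input.

```haskell
import Control.Arrow (first)

-- Alternating split with accumulators: both halves come out reversed.
split :: [a] -> ([a], [a])
split = go [] []
  where
    go as bs []       = (as, bs)
    go as bs (x : xs) = go (x : bs) as xs

-- Front/back split: the second argument advances two steps for each step
-- of the first, so the first half ends when the second argument runs out.
stableSplit :: [a] -> ([a], [a])
stableSplit xs = go xs xs
  where
    go (x : xs') (_ : _ : ys) = first (x :) (go xs' ys)
    go xs' _                  = ([], xs')

-- split       [1, 2, 3, 4, 5] == ([5, 3, 1], [4, 2])
-- stableSplit [1, 2, 3, 4, 5] == ([1, 2], [3, 4, 5])
```

`split` interleaves and reverses the elements, so two equal elements can change their relative order on the way through the merges. `stableSplit` keeps the original order and puts every element of the first half before every element of the second, and since `merge` takes from its left argument on ties, equal elements stay in input order.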
The best way is probably doing a bottom-up merge. It's the approach the sort in Data.List takes. Here we merge consecutive pairs of lists until there is only a single list left:
mergeSort :: Ord a => [a] -> [a]
mergeSort [] = []
mergeSort xs = mergeAll (map (:[]) xs) where
    mergePairs (x:y:ys) = merge x y : mergePairs ys
    mergePairs xs = xs
    mergeAll [xs] = xs
    mergeAll xs = mergeAll (mergePairs xs)
Data.List.sort works largely the same as above, except it starts with finding descending and ascending runs in the input instead of just creating singleton lists from the elements.
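The run-finding idea can be sketched in a simplified form. The code below is not the library's implementation: it only detects non-decreasing runs (no descending ones), and the names `ascendingRuns` and `runSort` are hypothetical. It then feeds the runs to the same pairwise merging as above.

```haskell
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- Hypothetical helper: cut the input into maximal non-decreasing runs.
ascendingRuns :: Ord a => [a] -> [[a]]
ascendingRuns []       = []
ascendingRuns (x : xs) = go x [] xs
  where
    -- prev is the last element of the current run; acc is the rest, reversed
    go prev acc (y : ys)
      | prev <= y = go y (prev : acc) ys
    go prev acc rest = reverse (prev : acc) : ascendingRuns rest

mergePairs :: Ord a => [[a]] -> [[a]]
mergePairs (x : y : rest) = merge x y : mergePairs rest
mergePairs xss            = xss

mergeAll :: Ord a => [[a]] -> [a]
mergeAll []   = []
mergeAll [xs] = xs
mergeAll xss  = mergeAll (mergePairs xss)

-- Hypothetical name: bottom-up merge sort that starts from runs.
runSort :: Ord a => [a] -> [a]
runSort = mergeAll . ascendingRuns

-- ascendingRuns [1, 3, 2, 2, 5, 4] == [[1, 3], [2, 2, 5], [4]]
```

Starting from runs means that an input that is already sorted forms a single run and needs no merging at all, whereas the singleton version always performs the full set of merges.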