Is this a correctly implemented mergesort in Haskell?

I could not find my code anywhere on the net, so could you please tell me whether the function myMergeSort is a mergesort, and why or why not? I know my function myMergeSort sorts, but I am not sure whether it really sorts using the mergesort algorithm or whether it is a different algorithm. I just began with Haskell a few days ago.

merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

myMergeSort :: [Int] -> [Int]
myMergeSort [] = []
myMergeSort (x:[]) = [x]
myMergeSort (x:xs) = foldl merge [] (map (\x -> [x]) (x:xs))

I have no questions about the merge function.

The following function mergeSortOfficial was the solution presented to us. I understand it, but I am not sure whether my function myMergeSort implements the mergesort algorithm correctly.

Official solution - implementation:

mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs = merge
    (mergeSortOfficial (take ((length xs) `div` 2) xs))
    (mergeSortOfficial (drop ((length xs) `div` 2) xs))

No, that's not mergeSort. That's insertionSort, which is essentially the same algorithm as bubbleSort, depending on how you stare at it. At each step, a singleton list is merged with the accumulated ordered-list-so-far, so, effectively, the element of that singleton is inserted.
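To make the comparison concrete, here is a small sketch (not part of the original answer) that puts the fold from the question next to a fold over insert from Data.List. The names foldMergeSort and insertionSort are made up for illustration.

```haskell
import Data.List (insert)

-- merge, as defined in the question (type signature added here)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

-- The question's strategy: merge one singleton at a time into the accumulator.
foldMergeSort :: Ord a => [a] -> [a]
foldMergeSort = foldl merge [] . map (: [])

-- Textbook insertion sort: insert one element at a time into the accumulator.
insertionSort :: Ord a => [a] -> [a]
insertionSort = foldl (flip insert) []
```

Both folds grow a sorted accumulator by one element per step, which is the shape of insertion sort rather than of divide and conquer.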

As other commenters have already observed, to get mergeSort (and in particular, its efficiency), it's necessary to divide the problem repeatedly into roughly equal parts (rather than "one element" and "the rest"). The "official" solution gives a rather clunky way to do that. I quite like

foldr (\ x (ys, zs) -> (x : zs, ys)) ([], [])

as a way to split a list in two, not in the middle, but into elements in even and odd positions.
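As a quick illustration of that split, the fold can be given a name and tried on a short list. The name deal is hypothetical and only used here.

```haskell
-- Deal the elements alternately into two piles. The fold works from the
-- right and swaps the piles at every step, so the first element of the
-- input always ends up at the head of the first pile.
deal :: [a] -> ([a], [a])
deal = foldr (\x (ys, zs) -> (x : zs, ys)) ([], [])
```

For example, deal [1, 2, 3, 4, 5] evaluates to ([1, 3, 5], [2, 4]); the two piles never differ in length by more than one.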

If, like me, you like to have structure up front where you can see it, you can make ordered lists a Monoid.

import Data.Monoid
import Data.Foldable
import Control.Newtype

newtype Merge x = Merge {merged :: [x]}
instance Newtype (Merge x) [x] where
  pack = Merge
  unpack = merged

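-- Since GHC 8.4, Semigroup is a superclass of Monoid, so the combining
-- operation lives in a Semigroup instance.
instance Ord x => Semigroup (Merge x) where
  Merge xs <> Merge ys = Merge (merge xs ys)  -- merge is as you defined it

instance Ord x => Monoid (Merge x) where
  mempty = Merge []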

And now you have insertion sort just by

ala' Merge foldMap (:[]) :: Ord x => [x] -> [x]
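If the newtype package (Control.Newtype) is not at hand, the same idea can be sketched with base alone by wrapping and unwrapping manually. This is an assumption-laden variant, not the answer's own code: the name sortVia is hypothetical, and merge is repeated from the question so that the block stands alone.

```haskell
-- merge, as defined in the question (type signature added here)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

-- Sorted lists under merging, with the empty list as the neutral element.
newtype Merge a = Merge { merged :: [a] }

instance Ord a => Semigroup (Merge a) where
  Merge xs <> Merge ys = Merge (merge xs ys)

instance Ord a => Monoid (Merge a) where
  mempty = Merge []

-- Wrap every element as a singleton, combine with foldMap, then unwrap.
-- This plays the role of ala' Merge foldMap (:[]).
sortVia :: (Foldable t, Ord a) => t a -> [a]
sortVia = merged . foldMap (\x -> Merge [x])
```

On a plain list foldMap combines the singletons one after another, so sortVia behaves as an insertion sort there; the shape of the container decides the shape of the merging.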

One way to get the divide-and-conquer structure of mergeSort is to make it a data structure: binary trees.

-- deriving Foldable needs the DeriveFoldable extension (on by default under GHC2021)
data Tree x = None | One x | Node (Tree x) (Tree x) deriving Foldable

I haven't enforced a balancing invariant here, but I could. The point is that the same operation as before has another type

ala' Merge foldMap (:[]) :: Ord x => Tree x -> [x]

which merges lists collected from a treelike arrangement of elements. To obtain said arrangements, think "what's cons for Tree?" and make sure you keep your balance, by the same kind of twistiness I used in the above "dividing" operation.

twistin :: x -> Tree x -> Tree x   -- a very cons-like type
twistin x None        = One x
twistin x (One y)     = Node (One x) (One y)
twistin x (Node l r)  = Node (twistin x r) l

Now you have mergeSort by building a binary tree, then merging it.

mergeSort :: Ord x => [x] -> [x]
mergeSort = ala' Merge foldMap (:[]) . foldr twistin None
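For readers who want to run the tree-based version without the newtype package or the DeriveFoldable extension, here is a self-contained sketch under those assumptions. The Foldable instance is written by hand, and the names treeMergeSort and depth are hypothetical additions used only to try it out.

```haskell
-- merge, as defined in the question (type signature added here)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

newtype Merge a = Merge { merged :: [a] }

instance Ord a => Semigroup (Merge a) where
  Merge xs <> Merge ys = Merge (merge xs ys)

instance Ord a => Monoid (Merge a) where
  mempty = Merge []

data Tree a = None | One a | Node (Tree a) (Tree a)

-- Hand-written equivalent of what deriving Foldable would produce.
instance Foldable Tree where
  foldMap _ None       = mempty
  foldMap f (One x)    = f x
  foldMap f (Node l r) = foldMap f l <> foldMap f r

-- Insert into the right subtree and swap the subtrees, so that the two
-- sides never differ in size by more than one element.
twistin :: a -> Tree a -> Tree a
twistin x None       = One x
twistin x (One y)    = Node (One x) (One y)
twistin x (Node l r) = Node (twistin x r) l

-- Number of Node levels above the deepest leaf.
depth :: Tree a -> Int
depth (Node l r) = 1 + max (depth l) (depth r)
depth _          = 0

-- Build the balanced tree, then merge it from the leaves upwards.
treeMergeSort :: Ord a => [a] -> [a]
treeMergeSort = merged . foldMap (\x -> Merge [x]) . foldr twistin None
```

The depth function is there only to check the balance: a tree built from eight elements has depth 3, which is where the logarithmic number of merge levels comes from.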

Of course, introducing the intermediate data structure has curiosity value, but you can easily cut it out and get something like

mergeSort :: Ord x => [x] -> [x]
mergeSort []   = []
mergeSort [x]  = [x]
mergeSort xs   = merge (mergeSort ys) (mergeSort zs) where
  (ys, zs) = foldr (\ x (ys, zs) -> (x : zs, ys)) ([], []) xs

where the tree has become the recursion structure of the program.

myMergeSort is not a correct merge sort. It is a correct insertion sort, though. We start with an empty list, then insert the elements one by one into the correct position:

myMergeSort [2, 1, 4, 3] == 
foldl merge [] [[2], [1], [4], [3]] ==
((([] `merge` [2]) `merge` [1]) `merge` [4]) `merge` [3] == 
(([2] `merge` [1]) `merge` [4]) `merge` [3] ==
([1, 2] `merge` [4]) `merge` [3] == 
[1, 2, 4] `merge` [3] == 
[1, 2, 3, 4]

Since each insertion takes linear time, the whole sort is quadratic.
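The quadratic behaviour can be observed by counting comparisons. The following is only a sketch: mergeCount and insertionComparisons are hypothetical helpers, with mergeCount being the question's merge extended to return the number of comparisons it made.

```haskell
-- Same logic as merge in the question, but also returns how many
-- element comparisons were performed.
mergeCount :: Ord a => [a] -> [a] -> (Int, [a])
mergeCount xs [] = (0, xs)
mergeCount [] ys = (0, ys)
mergeCount (x : xs) (y : ys)
    | x <= y    = let (c, r) = mergeCount xs (y : ys) in (c + 1, x : r)
    | otherwise = let (c, r) = mergeCount (x : xs) ys in (c + 1, y : r)

-- Total comparisons made by the foldl strategy from the question.
insertionComparisons :: Ord a => [a] -> Int
insertionComparisons = fst . foldl step (0, [])
  where
    step (total, acc) x =
      let (c, acc') = mergeCount acc [x]
      in  (total + c, acc')
```

On already sorted input every new element is compared against the whole accumulator, so insertionComparisons [1 .. 100] gives 4950, that is 100 * 99 / 2. On reverse-sorted input each insertion stops after one comparison, giving 99.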

mergeSortOfficial is technically correct, but it's inefficient. length takes linear time, and it is called on the whole current sublist at every level of the recursion. take and drop are also linear. The overall complexity remains the optimal n * log n, but each call makes several traversals that could be avoided.

If we stick to top-down merging, we can do better by splitting the list into the elements with even indices and the elements with odd indices. Splitting is still linear, but it takes only a single traversal instead of two (length and then take / drop in the official sort).

split :: [a] -> ([a], [a])
split = go [] [] where
  go as bs []     = (as, bs)
  go as bs (x:xs) = go (x:bs) as xs

mergeSortOfficial :: [Int] -> [Int]
mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs = 
  let (as, bs) = split xs in
    merge (mergeSortOfficial as) (mergeSortOfficial bs)

As WillNess noted in the comments, the above split yields an unstable sort. We can use a stable alternative:

import Control.Arrow

stableSplit :: [a] -> ([a], [a])
stableSplit xs = go xs xs where
    go (x:xs) (_:_:ys) = first (x:) (go xs ys)
    go xs     ys       = ([], xs)
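The difference between the two splits only shows up when elements compare equal but are still distinguishable. Below is a sketch that sorts pairs by their first component; mergeOn and sortOnWith are hypothetical helpers written for this illustration, while split and stableSplit repeat the definitions above.

```haskell
import Control.Arrow (first)

-- The reversing split from above.
split :: [a] -> ([a], [a])
split = go [] [] where
  go as bs []       = (as, bs)
  go as bs (x : xs) = go (x : bs) as xs

-- The order-preserving split from above: the second argument advances
-- two steps for every step of the first, so the cut lands in the middle.
stableSplit :: [a] -> ([a], [a])
stableSplit xs = go xs xs where
  go (y : ys) (_ : _ : zs) = first (y :) (go ys zs)
  go ys       _            = ([], ys)

-- merge, comparing elements by a key only.
mergeOn :: Ord k => (a -> k) -> [a] -> [a] -> [a]
mergeOn _ xs [] = xs
mergeOn _ [] ys = ys
mergeOn key (x : xs) (y : ys)
    | key x <= key y = x : mergeOn key xs (y : ys)
    | otherwise      = y : mergeOn key (x : xs) ys

-- Top-down merge sort, parameterised over the splitting function.
sortOnWith :: Ord k => ([a] -> ([a], [a])) -> (a -> k) -> [a] -> [a]
sortOnWith splitter key = go where
  go []  = []
  go [x] = [x]
  go xs  = let (ls, rs) = splitter xs
           in  mergeOn key (go ls) (go rs)
```

With the input [(1, 'a'), (1, 'b')], sortOnWith stableSplit fst keeps the original order, whereas sortOnWith split fst returns [(1, 'b'), (1, 'a')], because split hands the later element to the left-hand side of the merge.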

The best way is probably to do a bottom-up merge. It's the approach the sort in Data.List takes. Here we merge consecutive pairs of lists until there is only a single list left:

mergeSort :: Ord a => [a] -> [a]
mergeSort [] = []
mergeSort xs = mergeAll (map (:[]) xs) where
    mergePairs (x:y:ys) = merge x y : mergePairs ys
    mergePairs xs       = xs

    mergeAll [xs] = xs
    mergeAll xs   = mergeAll (mergePairs xs)
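To see the passes of the bottom-up merge, the intermediate lists of lists can be collected instead of thrown away. This is a sketch for illustration only; passes is a hypothetical name, and mergePairs is lifted out of the where clause above.

```haskell
-- merge, as defined in the question (type signature added here)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

-- One pass: merge neighbouring lists pairwise.
mergePairs :: Ord a => [[a]] -> [[a]]
mergePairs (x : y : ys) = merge x y : mergePairs ys
mergePairs xs           = xs

-- All intermediate states, from singletons down to one sorted list.
passes :: Ord a => [a] -> [[[a]]]
passes = untilSingle . iterate mergePairs . map (: [])
  where
    untilSingle (s : rest)
      | length s <= 1 = [s]
      | otherwise     = s : untilSingle rest
    untilSingle []    = []
```

For [4, 3, 2, 1] the states are [[4], [3], [2], [1]], then [[3, 4], [1, 2]], then [[1, 2, 3, 4]]. Each pass halves the number of lists, so there are about log n passes of linear work each.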

Data.List.sort works largely the same as above, except it starts with finding descending and ascending runs in the input instead of just creating singleton lists from the elements.
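The run-finding idea can be sketched in a simplified form. The code below is not the Data.List implementation: it only detects non-descending runs (the library also handles descending ones), and the names ascendingRuns and runSort are hypothetical.

```haskell
-- merge, as defined in the question (type signature added here)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
    | x <= y    = x : merge xs (y : ys)
    | otherwise = y : merge (x : xs) ys

-- Cut the input into maximal non-descending runs.
-- The run being built is kept reversed and flipped when it ends.
ascendingRuns :: Ord a => [a] -> [[a]]
ascendingRuns []       = []
ascendingRuns (x : xs) = go x [x] xs
  where
    go _    acc []       = [reverse acc]
    go prev acc (y : ys)
      | prev <= y = go y (y : acc) ys
      | otherwise = reverse acc : go y [y] ys

-- Bottom-up merge sort that starts from runs instead of singletons.
runSort :: Ord a => [a] -> [a]
runSort = mergeAll . ascendingRuns
  where
    mergeAll []   = []
    mergeAll [xs] = xs
    mergeAll xss  = mergeAll (mergePairs xss)

    mergePairs (x : y : ys) = merge x y : mergePairs ys
    mergePairs xss          = xss
```

For example, ascendingRuns [1, 2, 5, 3, 4, 0] gives [[1, 2, 5], [3, 4], [0]], so only two passes of merging remain. On input that is already sorted there is a single run and no merging at all.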
