
Splitting a BinTree with tail recursion in Haskell

So this week we learned about union types, tail recursion and binary trees in Haskell. We defined our tree data type like so:

data BinTree a = Empty
           | Node (BinTree a) a (BinTree a)
           deriving (Eq, Show)

leaf :: a -> BinTree a
leaf x = Node Empty x Empty

Now we were asked to write a function that finds the leftmost node, cuts it out, and returns both its value and the remaining tree without that node.

We did something like this, which worked quite well:

splitleftmost :: BinTree a -> Maybe (a, BinTree a)
splitleftmost Empty = Nothing
splitleftmost (Node l a r) = case splitleftmost l of
                                 Nothing -> Just (a, r)
                                 Just (a',l') -> Just (a', Node l' a r)
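
To make the expected behaviour concrete, here is a small sketch that applies the function above to a three-node tree. The names sample and expected are made up for illustration, and the definitions are repeated only so that the snippet compiles on its own.

```haskell
data BinTree a = Empty
               | Node (BinTree a) a (BinTree a)
               deriving (Eq, Show)

leaf :: a -> BinTree a
leaf x = Node Empty x Empty

splitleftmost :: BinTree a -> Maybe (a, BinTree a)
splitleftmost Empty = Nothing
splitleftmost (Node l a r) = case splitleftmost l of
  Nothing       -> Just (a, r)
  Just (a', l') -> Just (a', Node l' a r)

-- Hypothetical sample tree: 2 at the root, 1 on the left, 3 on the right.
sample :: BinTree Int
sample = Node (leaf 1) 2 (leaf 3)

-- The leftmost value is 1; what remains is the root with an empty left subtree.
expected :: Maybe (Int, BinTree Int)
expected = Just (1, Node Empty 2 (leaf 3))
```

Note that the recursive call sits inside a case expression, so work remains after it returns. That pending work is exactly what stops the function from being tail recursive.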

Now I need to make this function tail recursive. I think I understand what tail recursion is about, but I find it hard to apply to this problem. I was told to write a function that calls the main function with suitable arguments, but I was still unable to solve this.

Since nodes do not have a parent link, one approach is to maintain the root-to-leaf path in a list. At the end, the modified tree can be constructed using a left fold:

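slm :: BinTree a -> Maybe (a, BinTree a)
slm = run []
    where
    -- The first argument is the stack of ancestors, nearest parent first.
    run _ Empty = Nothing
    run t (Node Empty x r) = Just (x, foldl go r t)
    run t n@(Node l _ _) = run (n:t) l

    -- Put the rebuilt subtree back as the left child of each ancestor.
    go l (Node _ y r') = Node l y r'
    go l Empty = l  -- never reached: the stack only holds Node values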
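
As a rough check of the list-based idea, the sketch below runs it on a tree whose left spine is three nodes deep. The names reattach, sample and result are assumptions chosen for this example, and reattach has an extra clause for Empty only to keep the pattern match total.

```haskell
data BinTree a = Empty
               | Node (BinTree a) a (BinTree a)
               deriving (Eq, Show)

leaf :: a -> BinTree a
leaf x = Node Empty x Empty

slm :: BinTree a -> Maybe (a, BinTree a)
slm = run []
  where
    -- path holds the ancestors visited so far, nearest parent first.
    run _    Empty            = Nothing
    run path (Node Empty x r) = Just (x, foldl reattach r path)
    run path n@(Node l _ _)   = run (n : path) l

    -- Rebuild one level: the subtree built so far becomes the left child.
    reattach l (Node _ y r) = Node l y r
    reattach l Empty        = l  -- not reached; path only contains Node values

-- Hypothetical sample: left spine 4 -> 2 -> 1.
sample :: BinTree Int
sample = Node (Node (leaf 1) 2 (leaf 3)) 4 (leaf 5)

result :: Maybe (Int, BinTree Int)
result = slm sample
-- Just (1, Node (Node Empty 2 (leaf 3)) 4 (leaf 5))
```

When the leftmost node is found, path is [node 2, node 4], so the fold first rebuilds node 2 with an empty left subtree and then hangs that under node 4.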

Here, not to spoil anything, are some "tail recursive" definitions of functions for summing along the left and right branches, at least as I understand "tail recursion":

sumLeftBranch tree = loop 0 tree where
  loop n Empty        = n
  loop n (Node l a r) = loop (n+a) l

sumRightBranch tree = loop 0 tree where
  loop n Empty        = n
  loop n (Node l a r) = loop (n+a) r

You can see that every recursive use of loop has the same answer as the first call, loop 0 tree. The arguments just keep getting put into better and better shape until they reach the ideal shape, loop n Empty, which is n, the desired sum.
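
To see the accumulator being "put into better shape", here is a small sketch that records the value of n at each call. The helper leftBranchTrace and the tree sample are hypothetical names added for illustration; the helper is not itself tail recursive, since it only exists to expose the intermediate states.

```haskell
data BinTree a = Empty
               | Node (BinTree a) a (BinTree a)
               deriving (Eq, Show)

leaf :: a -> BinTree a
leaf x = Node Empty x Empty

-- Same shape as the definition above: the recursive call is the whole result.
sumLeftBranch :: Num a => BinTree a -> a
sumLeftBranch = loop 0
  where
    loop n Empty        = n
    loop n (Node l a _) = loop (n + a) l

-- Hypothetical helper: the accumulator seen by each successive call of loop.
leftBranchTrace :: Num a => BinTree a -> [a]
leftBranchTrace = go 0
  where
    go n Empty        = [n]
    go n (Node l a _) = n : go (n + a) l

-- Hypothetical sample: left spine 4 -> 2 -> 1.
sample :: BinTree Int
sample = Node (Node (leaf 1) 2 (leaf 3)) 4 (leaf 5)

-- leftBranchTrace sample == [0, 4, 6, 7]
-- sumLeftBranch sample   == 7
```

Each entry in the trace is the first argument of one call to loop, and the last entry is the final answer.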

If this is the kind of thing that is wanted, the setup for splitleftmost would be

splitLeftMost tree = loop Nothing tree 
  where
  loop m              Empty        = m
  loop Nothing        (Node l a r) = loop ? ? 
  loop (Just (a',r')) (Node l a r) = loop ? ?

Here, the first use of loop has the form loop Nothing tree, but that is the same as loop result Empty when we eventually come to it, namely result. It took me a couple of tries to get the missing arguments to loop ? ? right, but, as usual, they were obvious once I got them.

As others have hinted, there is no reason, in Haskell, to make this function tail-recursive. In fact, a tail-recursive solution will almost certainly be slower than the one you have devised! The main potential inefficiencies in the code you've provided involve allocation of pair and Just constructors. I believe GHC (with optimization enabled) will be able to figure out how to avoid these. My guess is that its ultimate code will probably look something like this:

{-# LANGUAGE UnboxedTuples #-}
splitleftmost :: BinTree a -> Maybe (a, BinTree a)
splitleftmost Empty = Nothing
splitleftmost (Node l a r) =
  case slm l a r of
    (# hd, tl #) -> Just (hd, tl)

slm :: BinTree a -> a -> BinTree a
    -> (# a, BinTree a #)
slm Empty a r = (# a, r #)
slm (Node ll la lr) a r =
  case slm ll la lr of
    (# hd, tl' #) -> (# hd, Node tl' a r #)

Those funny-looking (# ..., ... #) things are unboxed pairs, which are handled pretty much like multiple return values. In particular, no actual tuple constructor is allocated until the end. By recognizing that every invocation of splitleftmost with a non-empty tree will produce a Just result, we (and thus almost certainly GHC) can separate the empty case from the rest to avoid allocating intermediate Just constructors. So this final code only allocates stack frames to handle the recursive results. Since some representation of such a stack is inherently necessary to solve this problem, using GHC's built-in one seems pretty likely to give the best results.
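
To illustrate the point about needing some representation of a stack, here is one possible tail-recursive variant written in continuation-passing style. This is only a sketch and does not come from the answers above; the names splitCPS and descend are assumptions. Every call to descend is in tail position, but the chain of closures it builds plays the role of the stack and lives on the heap.

```haskell
data BinTree a = Empty
               | Node (BinTree a) a (BinTree a)
               deriving (Eq, Show)

leaf :: a -> BinTree a
leaf x = Node Empty x Empty

splitCPS :: BinTree a -> Maybe (a, BinTree a)
splitCPS Empty        = Nothing
splitCPS (Node l a r) = Just (descend l a r id)

-- descend l x r k splits the non-empty tree Node l x r.
-- The continuation k rebuilds everything above that node.
descend :: BinTree a -> a -> BinTree a -> (BinTree a -> BinTree a) -> (a, BinTree a)
descend Empty           x r k = (x, k r)
descend (Node ll la lr) x r k = descend ll la lr (\t -> k (Node t x r))

-- Hypothetical sample: left spine 4 -> 2 -> 1.
sample :: BinTree Int
sample = Node (Node (leaf 1) 2 (leaf 3)) 4 (leaf 5)

-- splitCPS sample == Just (1, Node (Node Empty 2 (leaf 3)) 4 (leaf 5))
```

Compared with the directly recursive version, nothing is saved: one closure is allocated per level instead of one stack frame, which is in line with the claim that the tail-recursive form is unlikely to be faster.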
