How to write a function for N-ary tree traversal in Haskell

I need to traverse an N-ary tree and attach to each node the number at which I visited it in preorder. I have an n-ary tree defined like this:

data NT a = N a [NT a] deriving Show

Example: If I have the following tree:

let ntree = N "eric" [N "lea" [N "kristy" [],N "pedro" [] ,N "rafael" []],N "anna" [],N "bety" []]

I want to transform it to

let ntree = N (1,"eric") [N (2,"lea") [N (3,"kristy") [],N (4,"pedro") [] ,N (5,"rafael") []],N (6,"anna") [],N (7,"bety") []]

"Preorderedness" isn't that important.

I want to see how to write a function that passes values between levels: how to pass a number down to the list of successors, how to pass the updated number back to the parent, and how to carry that number on to the other branches.

So far I have been able to write functions like this:

traverse :: NT String -> String
traverse (N val []) =" "++val++" "
traverse (N val list) =val++" " ++ (concat $ map  traverse list)

which outputs

"eric lea  kristy  pedro  rafael  anna  bety "

EDIT: The question is:

How can I write a function

numberNodes :: NT a -> NT (a,Int)

that numbers nodes according to the preorder traversal of the tree?

The hard part for me to understand is passing auxiliary data around. Could you please elaborate on that?

In this concrete case it is a single Int that means the "time", or order, in which I traverse this tree.

First Attempt: Hard Work

For the case of n-ary trees, there are three things going on: numbering elements, numbering trees, and numbering lists of trees. It would help to treat them separately. Types first:

aNumber   :: a                -- thing to number
          -> Int              -- number to start from
          -> ( (a, Int)       -- numbered thing
             , Int            -- next available number afterwards
             )

ntNumber  :: NT a             -- thing to number
          -> Int              -- number to start from
          -> ( NT (a, Int)    -- numbered thing
             , Int            -- next available number afterwards
             )

ntsNumber :: [NT a]           -- thing to number
          -> Int              -- number to start from
          -> ( [NT (a, Int)]  -- numbered thing
             , Int            -- next available number afterwards
             )

Notice that all three types share the same pattern. When you see that there is a pattern that you are following, apparently by coincidence, you know you have an opportunity to learn something. But let's press on for now and learn later.

Numbering an element is easy: copy the starting number into the output and return its successor as the next available.

aNumber a i = ((a, i), i + 1)

For the other two, the pattern (there's that word again) is

  1. split the input into its top-level components
  2. number each component in turn, threading the numbers through

It's easy to do the first with pattern matching (inspecting the data visually) and the second with where clauses (grabbing the two parts of the output).

For trees, a top-level split gives us two components: an element and a list. In the where clause, we call the appropriate numbering functions as directed by those types. In each case, the "thing" output tells us what to put in place of the "thing" input. Meanwhile, we thread the numbers through, so the starting number for the whole is the starting number for the first component, the "next" number for the first component starts the second, and the "next" number from the second is the "next" number for the whole.

ntNumber (N a ants) i0  = (N ai aints, i2) where
  (ai,    i1) = aNumber   a    i0
  (aints, i2) = ntsNumber ants i1

For lists, we have two possibilities. An empty list has no components, so we return it directly without using any more numbers. A "cons" has two components, so we do exactly as we did before, using the appropriate numbering functions as directed by the type.

ntsNumber []           i  = ([], i)
ntsNumber (ant : ants) i0 = (aint : aints, i2) where
  (aint,  i1) = ntNumber  ant  i0
  (aints, i2) = ntsNumber ants i1

Let's give it a go.

> let ntree = N "eric" [N "lea" [N "kristy" [],N "pedro" [] ,N "rafael" []],N "anna" [],N "bety" []]
> ntNumber ntree 0
(N ("eric",0) [N ("lea",1) [N ("kristy",2) [],N ("pedro",3) [],N ("rafael",4) []],N ("anna",5) [],N ("bety",6) []],7)
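The session above starts counting at 0 and also hands back the next free number, whereas the question asked for a function numberNodes :: NT a -> NT (a, Int) whose example starts at 1. A minimal sketch of such a wrapper follows. It repeats the definitions from this section so that it stands alone; the choice to start at 1 is an assumption read off the question's example, and the Eq instance is added only so results can be compared.

```haskell
data NT a = N a [NT a] deriving (Show, Eq)

-- Number one element: use the current number, hand back its successor.
aNumber :: a -> Int -> ((a, Int), Int)
aNumber a i = ((a, i), i + 1)

-- Number a tree: first the node's element, then its list of subtrees.
ntNumber :: NT a -> Int -> (NT (a, Int), Int)
ntNumber (N a ants) i0 = (N ai aints, i2) where
  (ai,    i1) = aNumber   a    i0
  (aints, i2) = ntsNumber ants i1

-- Number a list of trees from left to right.
ntsNumber :: [NT a] -> Int -> ([NT (a, Int)], Int)
ntsNumber []           i  = ([], i)
ntsNumber (ant : ants) i0 = (aint : aints, i2) where
  (aint,  i1) = ntNumber  ant  i0
  (aints, i2) = ntsNumber ants i1

-- Assumed wrapper: start at 1 and discard the leftover counter.
numberNodes :: NT a -> NT (a, Int)
numberNodes t = fst (ntNumber t 1)
```

Note that the pairs come out as (element, number), following the requested type signature rather than the (number, element) order shown in the question's example.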

So we're there. But are we happy? Well, I'm not. I have the annoying sensation that I wrote pretty much the same type three times and pretty much the same program twice. And if I wanted to do more element-numbering for differently organised data (e.g., your binary trees), I'd have to write the same thing yet again. Repetitive patterns in Haskell code are always missed opportunities: it's important to develop a sense of self-criticism and ask whether there's a neater way.

Second Attempt: Numbering and Threading

Two of the repetitive patterns we saw above are 1. the similarity of the types, and 2. the similarity of the way the numbers get threaded.

If you match up the types to see what's in common, you'll notice they're all

input -> Int -> (output, Int)

for different inputs and outputs. Let's give the largest common component a name.

type Numbering output = Int -> (output, Int)

Now our three types are

aNumber   :: a      -> Numbering (a, Int)
ntNumber  :: NT a   -> Numbering (NT (a, Int))
ntsNumber :: [NT a] -> Numbering [NT (a, Int)]

You often see such types in Haskell:

             input  -> DoingStuffToGet output

Now, to deal with the threading, we can build some helpful tools to work with and combine Numbering operations. To see which tools we need, look at how we combine the outputs after we've numbered the components. The "thing" parts of the outputs are always built by applying some functions which don't get numbered (data constructors, usually) to some "thing" outputs from numberings.

To deal with the functions, we can build a gadget that looks a lot like our [] case, where no actual numbering was needed.

steady :: thing -> Numbering thing
steady x i = (x, i)

Don't be put off by the way the type makes it look as if steady has only one argument: remember that Numbering thing abbreviates a function type, so there really is another -> in there. We get

steady [] :: Numbering [a]
steady [] i = ([], i)

just like in the first line of ntsNumber.

But what about the other constructors, N and (:)? Ask ghci.

> :t steady N
steady N :: Numbering (a -> [NT a] -> NT a)
> :t steady (:)
steady (:) :: Numbering (a -> [a] -> [a])

We get numbering operations with functions as outputs, and we want to generate the arguments to those functions by more numbering operations, producing one big overall numbering operation with the numbers threaded through. One step of that process is to feed a numbering-generated function one numbering-generated input. I'll define that as an infix operator.

($$) :: Numbering (a -> b) -> Numbering a -> Numbering b
infixl 2 $$

Compare with the type of the explicit application operator, $:

> :t ($)
($) :: (a -> b) -> a -> b

This $$ operator is "application for numberings". If we can get it right, our code becomes

ntNumber  :: NT a -> Numbering (NT (a, Int))
ntNumber  (N a ants)   i = (steady N $$ aNumber a $$ ntsNumber ants) i

ntsNumber :: [NT a] -> Numbering [NT (a, Int)]
ntsNumber []           i = steady [] i
ntsNumber (ant : ants) i = (steady (:) $$ ntNumber ant $$ ntsNumber ants) i

with aNumber as it was (for the moment). This code just does the data reconstruction, plugging together the constructors and the numbering processes for the components. We had better give the definition of $$ and make sure it gets the threading right.

($$) :: Numbering (a -> b) -> Numbering a -> Numbering b
(fn $$ an) i0 = (f a, i2) where
  (f, i1) = fn i0
  (a, i2) = an i1
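As a quick sanity check that $$ threads the counter from left to right, here is a small self-contained sketch. The names tick and twoTicks are hypothetical helpers introduced only for this illustration; the other definitions repeat the ones given in this answer.

```haskell
type Numbering output = Int -> (output, Int)

steady :: thing -> Numbering thing
steady x i = (x, i)

infixl 2 $$
($$) :: Numbering (a -> b) -> Numbering a -> Numbering b
(fn $$ an) i0 = (f a, i2) where
  (f, i1) = fn i0   -- run the function-producing process first
  (a, i2) = an i1   -- then the argument-producing process, from where fn stopped

-- Hypothetical helper: hand out the current number and advance the counter.
tick :: Numbering Int
tick i = (i, i + 1)

-- Hypothetical example: pair up two consecutive numbers.
twoTicks :: Numbering (Int, Int)
twoTicks = steady (,) $$ tick $$ tick
```

Starting twoTicks at 5 gives ((5,6),7): the first tick sees 5, the second sees 6, and 7 is left over as the next free number, while steady consumes nothing.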

Here, our old threading pattern gets done once. Each of fn and an is a function, expecting a starting number, and the whole of fn $$ an is a function, which gets the starting number i0. We thread the numbers through, collecting first the function, then the argument. We then do the actual application and hand back the final "next" number.

Now, notice that in every line of code, the i input is fed in as the argument to a numbering process. We can simplify this code by just talking about the processes, not the numbers.

ntNumber  :: NT a -> Numbering (NT (a, Int))
ntNumber  (N a ants)   = steady N $$ aNumber a $$ ntsNumber ants

ntsNumber :: [NT a] -> Numbering [NT (a, Int)]
ntsNumber []           = steady []
ntsNumber (ant : ants) = steady (:) $$ ntNumber ant $$ ntsNumber ants

One way to read this code is to filter out all the Numbering, steady and $$ uses.

ntNumber  :: NT a -> ......... (NT (a, Int))
ntNumber  (N a ants)   = ...... N .. (aNumber a) .. (ntsNumber ants)

ntsNumber :: [NT a] -> ......... [NT (a, Int)]
ntsNumber []           = ...... []
ntsNumber (ant : ants) = ...... (:) .. (ntNumber ant) .. (ntsNumber ants)

and you see it just looks like a preorder traversal, reconstructing the original data structure after processing the elements. We're doing the right thing with the values, provided steady and $$ are correctly combining the processes.

We could try to do the same for aNumber

aNumber  :: a -> Numbering (a, Int)
aNumber a = steady (,) $$ steady a $$ ????

but the ???? is where we actually need the number. We could build a numbering process that fits in that hole: a numbering process that issues the next number.

next :: Numbering Int
next i = (i, i + 1)

That's the essence of numbering: the "thing" output is the number to be used now (which is the starting number), and the "next" number output is the one after. We may write

aNumber a = steady (,) $$ steady a $$ next

which simplifies to

aNumber a = steady ((,) a) $$ next

In our filtered view, that's

aNumber a = ...... ((,) a) .. next

What we've done is to bottle the idea of a "numbering process" and we've built the right tools to do ordinary functional programming with those processes. The threading pattern turns into the definitions of steady and $$.

Numbering is not the only thing that works this way. Try this...

> :info Applicative
class Functor f => Applicative (f :: * -> *) where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

...and you also get some more stuff. I just want to draw attention to the types of pure and <*>. They're a lot like steady and $$, but they are not just for Numbering. Applicative is the type class for every kind of process which works that way. I'm not saying "learn Applicative now!", just suggesting a direction of travel.
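To make the correspondence concrete, here is a minimal sketch of Numbering as an Applicative. It assumes Numbering is rewrapped as a newtype, since a plain type synonym cannot be given its own instances; the field name runNumbering is a hypothetical choice for this illustration.

```haskell
-- Assumption: a newtype wrapper around the same Int -> (a, Int) function.
newtype Numbering a = Numbering { runNumbering :: Int -> (a, Int) }

instance Functor Numbering where
  fmap f (Numbering n) = Numbering $ \i0 ->
    let (a, i1) = n i0 in (f a, i1)

instance Applicative Numbering where
  -- pure plays the role of steady: produce a value, leave the counter alone.
  pure x = Numbering $ \i -> (x, i)
  -- <*> plays the role of $$: run the function side, then the argument side.
  Numbering fn <*> Numbering an = Numbering $ \i0 ->
    let (f, i1) = fn i0
        (a, i2) = an i1
    in  (f a, i2)

-- Issue the current number and advance the counter.
next :: Numbering Int
next = Numbering $ \i -> (i, i + 1)

-- Pair an element with the next number.
aNumber :: a -> Numbering (a, Int)
aNumber a = (,) a <$> next
```

With these instances, runNumbering (aNumber "x") 3 evaluates to (("x",3),4), the same result the hand-written aNumber gives.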

Third Attempt: Type-Directed Numbering

So far, our solution is directed towards one particular data structure, NT a, with [NT a] showing up as an auxiliary notion because it's used in NT a. We can make the whole thing a bit more plug-and-play if we focus on one layer of the type at a time. We defined numbering a list of trees in terms of numbering trees. In general, we know how to number a list of stuff if we know how to number each item of stuff.

If we know how to number an a to get a b, we should be able to number a list of a to get a list of b. We can abstract over "how to process each item".

listNumber :: (a -> Numbering b) -> [a] -> Numbering [b]
listNumber na []       = steady []
listNumber na (a : as) = steady (:) $$ na a $$ listNumber na as

and now our old list-of-trees-numbering function becomes

ntsNumber :: [NT a] -> Numbering [NT (a, Int)]
ntsNumber = listNumber ntNumber

which is hardly worth naming. We can just write

ntNumber :: NT a -> Numbering (NT (a, Int))
ntNumber (N a ants) = steady N $$ aNumber a $$ listNumber ntNumber ants

We can play the same game for the trees themselves. If you know how to number stuff, you know how to number a tree of stuff.

ntNumber' :: (a -> Numbering b) -> NT a -> Numbering (NT b)
ntNumber' na (N a ants) = steady N $$ na a $$ listNumber (ntNumber' na) ants

Now we can do things like this

myTree :: NT [String]
myTree = N ["a", "b", "c"] [N ["d", "e"] [], N ["f"] []]

> ntNumber' (listNumber aNumber) myTree 0
(N [("a",0),("b",1),("c",2)] [N [("d",3),("e",4)] [],N [("f",5)] []],6)

Here, the node data is now itself a list of things, but we've been able to number those things individually. Our equipment is more adaptable because each component aligns with one layer of the type.

Now, try this:
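The same layer-by-layer recipe carries over to other shapes of data. As a sketch, here it is for a binary tree; the type BT and the names btNumber' and btPreorder' are hypothetical, and the helper definitions are repeated so the block stands alone.

```haskell
type Numbering output = Int -> (output, Int)

steady :: thing -> Numbering thing
steady x i = (x, i)

infixl 2 $$
($$) :: Numbering (a -> b) -> Numbering a -> Numbering b
(fn $$ an) i0 = (f a, i2) where
  (f, i1) = fn i0
  (a, i2) = an i1

aNumber :: a -> Numbering (a, Int)
aNumber a i = ((a, i), i + 1)

-- Hypothetical binary tree type, used only for this illustration.
data BT a = Leaf a | Node (BT a) a (BT a) deriving (Show, Eq)

-- Processes components in constructor-field order: left, element, right.
-- That makes this an in-order numbering.
btNumber' :: (a -> Numbering b) -> BT a -> Numbering (BT b)
btNumber' na (Leaf a)     = steady Leaf $$ na a
btNumber' na (Node l a r) = steady Node $$ btNumber' na l $$ na a $$ btNumber' na r

-- For preorder, process the element first and rearrange when rebuilding.
btPreorder' :: (a -> Numbering b) -> BT a -> Numbering (BT b)
btPreorder' na (Leaf a)     = steady Leaf $$ na a
btPreorder' na (Node l a r) =
  steady (\a' l' r' -> Node l' a' r') $$ na a $$ btPreorder' na l $$ btPreorder' na r
```

The order in which numbers are handed out is simply the left-to-right order of the $$ chain, so choosing between in-order and preorder is a matter of which process is written first.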

> :t traverse
traverse :: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)

It's an awful lot like the thing we just did, where f is Numbering and t is sometimes lists and sometimes trees.

The Traversable class captures what it means to be a type-former that lets you thread some sort of process through the stored elements. Again, the pattern you're using is very common and has been anticipated. Learning to use traverse saves a lot of work.
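For comparison with ntNumber', here is a sketch of what a hand-written Traversable instance for NT could look like. It is one possible formulation rather than necessarily what the compiler would derive; fmapDefault and foldMapDefault come from Data.Traversable and supply the required Functor and Foldable instances in terms of traverse.

```haskell
import Data.Traversable (fmapDefault, foldMapDefault)

data NT a = N a [NT a] deriving (Show, Eq)

-- Functor and Foldable are superclasses of Traversable;
-- both can be recovered from traverse.
instance Functor NT where
  fmap = fmapDefault

instance Foldable NT where
  foldMap = foldMapDefault

-- Same shape as ntNumber': pure N stands in for steady N, <*> for $$,
-- and the inner traverse (the list one) stands in for listNumber.
instance Traversable NT where
  traverse f (N a ants) = N <$> f a <*> traverse (traverse f) ants
```

Because the element is processed before the list of subtrees, any effect threaded through traverse visits the nodes in preorder.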

Eventually...

...you'll learn that a thing to do the job of Numbering already exists in the library: it's called State Int and it belongs to the Monad class, which means it must also be in the Applicative class. To get hold of it,

import Control.Monad.State

and the operation which kicks off a stateful process with its initial state, like our feeding-in of 0, is this thing:

> :t evalState
evalState :: State s a -> s -> a

Our next operation becomes

next' :: State Int Int
next' = get <* modify (1+)

where get is the process that accesses the state, modify makes a process that changes the state, and <* means "but also do".

If you start your file with the language extension pragma

{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

you can declare your datatype like this

data NT a = N a [NT a] deriving (Show, Functor, Foldable, Traversable)

and Haskell will write traverse for you.

Your program then becomes one line...

evalState (traverse (\ a -> pure ((,) a) <*> get <* modify (1+)) ntree) 0
--                  ^ how to process one element ^^^^^^^^^^^^^^^
--         ^ how to process an entire tree of elements ^^^^^^^^^
--        ^ processing your particular tree ^^^^^^^^^^^^^^^^^^^^^^^^^^^
-- ^ kicking off the process with a starting number of 0 ^^^^^^^^^^^^^^^^

...but the journey to that one line involves a lot of "bottling the pattern" steps, which takes some (hopefully rewarding) learning.
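Putting the pieces of this last section into one file gives something like the following sketch. It assumes the mtl package (which provides Control.Monad.State) is available; the wrapper name numberNodes comes from the question, starting at 1 is an assumption based on the question's example, and next' is written here with state instead of get <* modify (1+), which should behave the same way.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

import Control.Monad.State (State, evalState, state)

data NT a = N a [NT a] deriving (Show, Eq, Functor, Foldable, Traversable)

-- Hand out the current number and bump the counter.
next' :: State Int Int
next' = state (\i -> (i, i + 1))

-- Assumed wrapper: pair every element with its preorder number, from 1.
numberNodes :: NT a -> NT (a, Int)
numberNodes t = evalState (traverse (\a -> (,) a <$> next') t) 1
```

Changing the final 1 to 0 reproduces the numbering used in the earlier ghci sessions.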

I'll update this answer as soon as I make some progress.

Right now I have reduced the problem from an n-ary tree to a binary tree.

data T a = Leaf a | N (T a) a (T a) deriving Show

numberNodes :: T a -> T (a, Int)
numberNodes tree = snd $ numberNodes2 tree 0

-- Returns (last number used, numbered tree).
numberNodes2 :: T a -> Int -> (Int, T (a, Int))
numberNodes2 (Leaf a) time = (time, Leaf (a, time))
numberNodes2 (N left nodeVal right) time = (rightTime, N leftTree (nodeVal, time) rightTree)
  where (leftTime, leftTree)   = numberNodes2 left (time + 1)
        (rightTime, rightTree) = numberNodes2 right (leftTime + 1)

From this tree:

let bt = N (N (Leaf "anna" ) "leo" (Leaf "laura")) "eric" (N (Leaf "john")  "joe" (Leaf "eddie"))

the function numberNodes creates the following tree:

N (N (Leaf ("anna",2)) ("leo",1) (Leaf ("laura",3))) ("eric",0) (N (Leaf ("john",5)) ("joe",4) (Leaf ("eddie",6)))

And now I just need to rewrite it for an n-ary tree... (which I don't know how to do; any hints?)
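One possible way to carry this style over to the n-ary case is to add a second function that walks the list of children, passing the last number used from each child to the next. The following is a sketch under that approach; the names numberTree and numberForest are hypothetical, and it keeps the convention above of returning the last number used rather than the next free one.

```haskell
data NT a = N a [NT a] deriving (Show, Eq)

numberNodes :: NT a -> NT (a, Int)
numberNodes tree = snd (numberTree tree 0)

-- Number the node with `time`, then its children.
-- Returns (last number used, numbered tree).
numberTree :: NT a -> Int -> (Int, NT (a, Int))
numberTree (N val children) time = (lastTime, N (val, time) children')
  where (lastTime, children') = numberForest children time

-- The Int argument is the last number already used.
-- Each child starts one past whatever its left sibling finished with.
numberForest :: [NT a] -> Int -> (Int, [NT (a, Int)])
numberForest []       lastUsed = (lastUsed, [])
numberForest (t : ts) lastUsed = (lastTime, t' : ts')
  where (used, t')      = numberTree t (lastUsed + 1)
        (lastTime, ts') = numberForest ts used
```

The empty-list case is what lets a number flow back up to the parent unchanged: a node with no children reports its own number as the last one used.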

This answer by @pigworker is excellent, and I learned lots from it.

However, I believe we can use mapAccumL from Data.Traversable to achieve a very similar behaviour:

{-# LANGUAGE DeriveTraversable #-}

import           Data.Traversable
import           Data.Tuple

-- original data type from the question
data NT a = N a [NT a]
    deriving (Show, Functor, Foldable, Traversable)

-- additional type from @pigworker's answer
type Numbering output = Int -> (output, Int)

-- compare this to signature of ntNumber
-- swap added to match the signature
ntNumberSimple :: (NT a) -> Numbering (NT (a, Int))
ntNumberSimple t n = swap $ mapAccumL func n t
    where
        func i x = (i+1, (x, i))

I believe that mapAccumL threads state under the hood in much the same way as the State monad (in base it is implemented with traverse over a small internal state-passing applicative), but at the very least that is completely hidden from the caller.
