简体   繁体   English

如何枚举Haskell中的递归数据类型?

[英]How to enumerate a recursive datatype in Haskell?

This blog post has an interesting explanation of how to use the Omega monad to enumerate an arbitrary grammar diagonally. 这篇博客文章有趣地解释了如何使用Omega monad对角枚举任意语法。 He offers an example of how to do so, resulting in an infinite sequence of strings. 他提供了一个示例,说明了如何执行此操作,从而产生了无限的字符串序列。 I'd like to do the same, except that, instead of generating a list of strings, it generates a list of an actual datatype. 我想做同样的事情,除了它不是生成字符串列表,而是生成实际数据类型的列表。 For example, 例如,

 data T = A | B T | C T T

Would generate 会产生

A, B A, C A A, C (B A) A... 

Or something similar. 或类似的东西。 Unfortunately my Haskell skills are still maturing and after some hours playing it I couldn't manage to do what I want. 不幸的是,我的Haskell技能仍在不断成熟,玩了几个小时后,我无法做自己想做的事。 How can that be done? 那怎么办?

As requested, one of my attempts (I have tried too many things...): 根据要求,我的尝试之一(我尝试了太多事情...):

import Control.Monad.Omega

data T = A | B T | C T T deriving (Show)

a = [A] 
        ++ (do { x <- each a; return (B x) })
        ++ (do { x <- each a; y <- each a; return (C x y) })

main = print $ take 10 $ a

My first ugly approach was: 我的第一个丑陋的做法是:

allTerms :: Omega T
allTerms = do
  which <- each [ 1,2,3 ]
  if which == 1 then
    return A
  else if which == 2 then do
    x <- allTerms
    return $ B x
  else do
    x <- allTerms
    y <- allTerms
    return $ C x y

But then, after some cleaning up I reached this one liner 但是然后,经过一番清理,我到达了这只内胆

import Control.Applicative
import Control.Monad.Omega
import Control.Monad

allTerms :: Omega T
allTerms = join $ each [return A, B <$> allTerms, C <$> allTerms <*> allTerms]

Note that order matters: return A has to be the first choice in the list above, or allTerms will not terminate. 请注意,顺序很重要: return A必须是上面列表中的首选,否则allTerms不会终止。 Basically, the Omega monad ensures a "fair scheduling" among choices, saving you from eg infiniteList ++ something , but does not prevent infinite recursion. 基本上, Omega monad确保选择之间的“合理调度”,从而使您免于infiniteList ++ something ,但并不能阻止无限递归。


An even more elegant solution was suggested by Crazy FIZRUK , exploiting the Alternative instance of Omega . Crazy FIZRUK提出了一种更为优雅的解决方案,该方法利用了OmegaAlternative实例。

import Control.Applicative
import Data.Foldable (asum)
import Control.Monad.Omega

allTerms :: Omega T
allTerms = asum [ pure A
                , B <$> allTerms
                , C <$> allTerms <*> allTerms
                ]

I finally found the time to write a generic version. 我终于找到了编写通用版本的时间。 It uses the Universe typeclass, which represents recursively enumerabley types. 它使用Universe类型类,该类型类表示递归可枚举类型。 Here it is: 这里是:

{-# LANGUAGE DeriveGeneric, TypeOperators, ScopedTypeVariables #-}
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}
{-# LANGUAGE UndecidableInstances, OverlappingInstances #-}

import Data.Universe
import Control.Monad.Omega
import GHC.Generics
import Control.Monad (mplus, liftM2)

class GUniverse f where
    guniverse :: [f a]

instance GUniverse U1 where
    guniverse = [U1]

instance (Universe c) => GUniverse (K1 i c) where
    guniverse = fmap K1 (universe :: [c])

instance (GUniverse f) => GUniverse (M1 i c f) where
    guniverse = fmap M1 (guniverse :: [f p])

instance (GUniverse f, GUniverse g) => GUniverse (f :*: g) where
    guniverse = runOmega $ liftM2 (:*:) ls rs
        where ls = each (guniverse :: [f p])
              rs = each (guniverse :: [g p])

instance (GUniverse f, GUniverse g) => GUniverse (f :+: g) where
    guniverse = runOmega $ (fmap L1 $ ls) `mplus` (fmap R1 $ rs)
        where ls = each (guniverse :: [f p])
              rs = each (guniverse :: [g p])

instance (Generic a, GUniverse (Rep a)) => Universe a where
    universe = fmap to $ (guniverse :: [Rep a x])


data T = A | B T | C T T deriving (Show, Generic)
data Tree a = Leaf a | Branch (Tree a) (Tree a) deriving (Show, Generic)

I couldn't find a way to remove UndecidableInstances , but that should be of no greater concern. 我找不到删除UndecidableInstances ,但这应该不再引起人们的关注。 OverlappingInstances is only required to override predefined Universe instances, like Either 's. 仅需要OverlappingInstances来覆盖预定义的Universe实例,例如Either的。 Now some nice outputs: 现在有一些不错的输出:

*Main> take 10 $ (universe :: [T])
[A,B A,B (B A),C A A,B (B (B A)),C A (B A),B (C A A),C (B A) A,B (B (B (B A))),C A (B (B A))]
*Main> take 20 $ (universe :: [Either Int Char])
[Left (-9223372036854775808),Right '\NUL',Left (-9223372036854775807),Right '\SOH',Left (-9223372036854775806),Right '\STX',Left (-9223372036854775805),Right '\ETX',Left (-9223372036854775804),Right '\EOT',Left (-9223372036854775803),Right '\ENQ',Left (-9223372036854775802),Right '\ACK',Left (-9223372036854775801),Right '\a',Left (-9223372036854775800),Right '\b',Left (-9223372036854775799),Right '\t']
*Main> take 10 $ (universe :: [Tree Bool])
[Leaf False,Leaf True,Branch (Leaf False) (Leaf False),Branch (Leaf False) (Leaf True),Branch (Leaf True) (Leaf False),Branch (Leaf False) (Branch (Leaf False) (Leaf False)),Branch (Leaf True) (Leaf True),Branch (Branch (Leaf False) (Leaf False)) (Leaf False),Branch (Leaf False) (Branch (Leaf False) (Leaf True)),Branch (Leaf True) (Branch (Leaf False) (Leaf False))]

I'm not exactly sure what happens in the branching order of mplus , but I think it should all work out if Omega is correctly implemented, which I strongly believe. 我不确定在mplus的分支顺序中会发生什么,但是我坚信Omega如果正确实施,应该都能解决。


But wait! 可是等等! The above implementation is not yet bug-free; 上面的实现还没有错误。 it diverges on "left recursive" types, like this: 它在“左递归”类型上有所不同,如下所示:

data T3 = T3 T3 | T3' deriving (Show, Generic)

while this works: 虽然这有效:

data T6 = T6' | T6 T6 deriving (Show, Generic)

I'll see if I can fix that. 我看看是否可以解决。 EDIT: At some time, the solution of this problem might be found in this question . 编辑:有时,可能会在此问题中找到此问题的解决方案。

You really should show us what you have tried so far. 您确实应该向我们展示您到目前为止所做的尝试。 But granted, this is not an easy problem for a bgeinner. 但是理所当然的是,这对于bgeinner而言并非易事。

Let's try to write a naive version down: 让我们尝试写下一个朴素的版本:

enum = A : (map B enum ++ [ C x y | x <- enum, y <- enum ])

Ok, this actually gives us: 好的,这实际上给了我们:

[A, B A, B (B A), B (B (B A)), .... ]

and never reaches the C values. 并且永远不会达到C值。

We obviously need to construct the list in steps. 我们显然需要分步构造列表。 Say we already have a complete list of items up to a certain nesting level, we can compute the items with one nesting level more in one step: 假设我们已经有一个特定嵌套级别的项目的完整列表,那么我们可以一步计算出一个嵌套级别的项目:

step xs = map B xs ++ [ C x y | x <- xs, y <- xs ]

For example, we get: 例如,我们得到:

> step [A]
[B A,C A A]
> step (step [A])
[B (B A),B (C A A),C (B A) (B A),C (B A) (C A A),C (C A A) (B A),C (C A A) (C A ...

What we want is thus: 因此,我们想要的是:

[A] ++ step [A] ++ step (step [A]) ++ .....

which is the concatenation of the result of 这是结果的串联

iterate step [A]

which is, of course 当然是

someT = concat (iterate step [A])

Warning : You will notice that this still does not give all values. 警告 :您会注意到,这仍然不能提供所有值。 For example: 例如:

C A (B (B A))

will be missing. 将丢失。

Can you find out why? 你能找出原因吗? Can you improve it? 你可以改善吗?

Below is a terrible solution, but perhaps an interesting one. 下面是一个糟糕的解决方案,但也许是一个有趣的解决方案。


We might consider the idea of adding "one more layer" 我们可能会考虑添加“一层以上”的想法

grow :: T -> Omega T
grow t = each [A, B t, C t t]

which is close to correct but has a flaw—in particular, in the C branch, we end up having both of the arguments take the exact same values instead of being able to vary independently. 这几乎是正确的,但有缺陷-特别是在C分支中,我们最终使两个参数都使用完全相同的值,而不是能够独立地变化。 We can fix this by computing the "base functor" of T which looks like this 我们可以通过计算如下的T的“基函子”来解决此问题

data T    = A  | B  T | C  T T
data Tf x = Af | Bf x | Cf x x deriving Functor

In particular, Tf is just a copy of T where the recursive calls are functor "holes" instead of direct recursive calls. 特别是, Tf只是T的副本,其中递归调用是函子“漏洞”,而不是直接递归调用。 Now we can write: 现在我们可以写:

grow :: Omega T -> Omega (Tf (Omega T))
grow ot = each [ Af, Bf ot, Cf ot ot ]

which has a whole computation of a new set of T in each hole. 每个孔中都有一组新的T的整体计算。 If we could somehow "flatten" the Omega (Tf (Omega T)) into Omega T then we'd have a computation which adds "one new layer" to our Omega computation correctly. 如果我们能以某种方式将Omega (Tf (Omega T)) “压平”到Omega T那么我们将有一个计算,可以在Omega计算中正确添加“一个新层”。

flatten :: Omega (Tf (Omega T)) -> Omega T
flatten = ...

and we could take the fixed point of this layering with fix 我们可以借此与层次感的定点fix

fix :: (a -> a) -> a

every :: Omega T
every = fix (flatten . grow)

So the only trick is to figure out flatten . 因此,唯一的技巧就是flatten To do this we need to notice two features of Tf . 为此,我们需要注意Tf两个功能。 First, it's Traversable so we can use sequenceA to "flip" the order of Tf and Omega 首先,它是可Traversable因此我们可以使用sequenceA来“翻转” TfOmega的顺序

flatten = ?f . fmap (?g . sequenceA)

where ?f :: Omega (Omega T) -> Omega T is just join . 其中?f :: Omega (Omega T) -> Omega T刚刚join The final tricky bit is figuring out ?g :: Omega (Tf T) -> Omega T . 最后一个棘手的问题是找出?g :: Omega (Tf T) -> Omega T Obviously, we don't care about the Omega layer so we should just fmap a function of type Tf T -> T . 显然,我们不在乎Omega层,因此我们应该fmap一个Tf T -> T类型的函数。

And this function is very close to the defining notion for the relationship between Tf and T : we can always compress a layer of Tf on the top of T . 而这个功能是非常接近的关系的界定概念TfT :我们总能压缩的层Tf上的顶部T

compress :: Tf T -> T
compress Af         = A
compress (Bf t)     = B t
compress (Cf t1 t2) = C t1 t2

All together we have 我们在一起

flatten :: Omega (Tf (Omega T)) -> Omega T
flatten = join . fmap (fmap compress . sequenceA)

Ugly, but all together functional. 丑陋,但一起发挥作用。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM