What is a pseudocode implementation of this context-free language enumerator?

This blog post in Haskell explains how context-free grammars can be enumerated with the aid of a monad called Omega.

I could not understand how this works, partly because the post does not explain how that monad works, but mostly because I don't understand monads. What is a proper pseudocode explanation of that algorithm, without monads?

Using a syntax similar to a simple, common language such as JavaScript or Python would be preferred.

Here's a Haskell version without the monad. I do use list comprehensions, but those are more intuitive and you have them in Python as well.

The Omega type is just a wrapper around [], but it helps to keep the "string of symbols" and the "list of possible strings" concepts separate. Since we're not going to use Omega for "list of possible strings", let's use a newtype wrapper for "string of symbols" to keep everything straight:

import Prelude hiding (String)

-- represent a sequence of symbols of type `a`,
-- i.e. a string recognised by a grammar over `a`
newtype String a = String [a]
    deriving Show

-- simple wrapper for (++) to also make things more explicit when we use it
joinStrings (String a1) (String a2) = String (a1 ++ a2)

Here's the Symbol type from the blogpost:

data Symbol a
    = Terminal a
    | Nonterminal [[Symbol a]] -- a disjunction of juxtapositions

The core of the Omega monad is actually the diagonal function:

-- | This is the hinge algorithm of the Omega monad,
-- exposed because it can be useful on its own.  Joins 
-- a list of lists with the property that for every x y 
-- there is an n such that @xs !! x !! y == diagonal xs !! n@.
diagonal :: [[a]] -> [a]

Given this, enumerate from the blogpost was:

enumerate :: Symbol a -> Omega [a]
enumerate (Terminal a) = return [a]
enumerate (Nonterminal alts) = do
    alt <- each alts          -- for each alternative
      -- (each is the Omega constructor :: [a] -> Omega a)
    rep <- mapM enumerate alt -- enumerate each symbol in the sequence
    return $ concat rep       -- and concatenate the results

Our enumerate will have this type:

enumerate :: Symbol a -> [String a]

The Terminal case is easy:

enumerate (Terminal a) = [String [a]]

In the Nonterminal case, a helper function that handles a single alternative will be useful:

-- Enumerate the strings accepted by a sequence of symbols
enumerateSymbols :: [Symbol a] -> [String a]

The base case is quite easy, though the result isn't [] but a singleton list containing the empty string:

enumerateSymbols [] = [String []]
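
To see why the base case has to be a singleton rather than [], here is a small Python illustration (plain nested comprehensions of my own, not code from the blog post). Pairing prefixes with an empty list of suffixes yields nothing at all, while pairing them with a list holding only the empty string leaves the prefixes intact:

```python
prefixes = ["a", "b"]

# With no suffixes to pair with, every prefix is lost.
with_empty_list = [p + s for p in prefixes for s in []]

# With a single empty suffix, each prefix survives unchanged.
with_empty_string = [p + s for p in prefixes for s in [""]]

print(with_empty_list)    # []
print(with_empty_string)  # ['a', 'b']
```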

For the non-empty case, another helper will be useful to pair up the strings from the head and from the tail in all possible ways, using diagonal:

crossProduct :: [a] -> [b] -> [(a, b)]
crossProduct as bs = diagonal [[(a, b) | b <- bs] | a <- as]

I could also have written [[(a, b) | a <- as] | b <- bs], but I chose the other because that ends up replicating the output from the blogpost.
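
The reason for going through diagonal rather than a plain two-generator comprehension shows up as soon as the lists are infinite. A rough Python illustration (the name naive_pairs is mine, just for this example): a nested loop never gets past the first element of the outer sequence.

```python
from itertools import count, islice

def naive_pairs():
    """Pair up two infinite sequences with a plain nested loop."""
    for a in count():
        for b in count():
            yield (a, b)

# The inner loop never finishes, so the first component is stuck at 0.
first_thousand = list(islice(naive_pairs(), 1000))
print(all(a == 0 for a, _ in first_thousand))  # True
```

A diagonal traversal avoids this by taking a little from each row in turn, so every pair shows up after finitely many steps.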

Now we can write the non-empty case for enumerateSymbols:

enumerateSymbols (sym:syms) =
    let prefixes = enumerate sym
        suffixes = enumerateSymbols syms
    in [joinStrings prefix suffix 
           | (prefix, suffix) <- crossProduct prefixes suffixes]

and now the Nonterminal case for enumerate:

enumerate (Nonterminal alts) =
    -- get the list of strings for each of the alternatives
    let choices = map enumerateSymbols alts
    -- and use diagonal to combine them in a "fair" way
    in diagonal choices
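
Since the question asked for something closer to Python, here is a rough translation of the pieces so far, using generators in place of lazy lists. Treat it as a sketch under a few assumptions of mine: the class names and enumerate_strings (renamed so it does not shadow Python's built-in enumerate) are invented for this example, and this diagonal simply opens one more row per pass. That gives the same fairness property as Omega's diagonal, but I would not rely on it producing the same order in every case. cross_product takes a function that rebuilds the suffix generator, because a Python generator can only be consumed once.

```python
from itertools import islice

class Terminal:
    def __init__(self, value):
        self.value = value

class Nonterminal:
    def __init__(self):
        # A list of alternatives, each a list of symbols. Filled in after
        # construction so that a grammar can refer to itself.
        self.alts = []

def diagonal(rows):
    """Fairly interleave an iterable of iterables; either may be infinite."""
    rows = iter(rows)
    active = []          # rows opened so far that still have elements
    more_rows = True
    while more_rows or active:
        if more_rows:    # open one more row on every pass
            try:
                active.append(iter(next(rows)))
            except StopIteration:
                more_rows = False
        survivors = []
        for row in active:
            try:
                yield next(row)
            except StopIteration:
                continue  # this row is exhausted, drop it
            survivors.append(row)
        active = survivors

def cross_product(as_, make_bs):
    """Yield every pair (a, b); make_bs() must return a fresh iterator."""
    def row(a):
        for b in make_bs():
            yield (a, b)
    return diagonal(row(a) for a in as_)

def enumerate_symbols(syms):
    """Enumerate the strings matched by a sequence of symbols."""
    if not syms:
        yield []          # exactly one result: the empty string
        return
    prefixes = enumerate_strings(syms[0])
    suffixes = lambda: enumerate_symbols(syms[1:])
    for prefix, suffix in cross_product(prefixes, suffixes):
        yield prefix + suffix

def enumerate_strings(sym):
    """Enumerate the strings matched by a single symbol."""
    if isinstance(sym, Terminal):
        yield [sym.value]
    else:
        yield from diagonal(enumerate_symbols(alt) for alt in sym.alts)

# Balanced parentheses: S -> "" | "(" S ")"
s = Nonterminal()
s.alts = [[], [Terminal("("), s, Terminal(")")]]
first_four = ["".join(w) for w in islice(enumerate_strings(s), 4)]
print(first_four)  # ['', '()', '(())', '((()))']
```

Rebuilding the suffix generator for every prefix repeats work that Haskell's shared lazy lists would do only once, so a serious version would probably cache results. For seeing how the algorithm fits together, the simple form is easier to follow.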

Here's the body of diagonal from the Omega source, with my explanations:

diagonal = diagonal' 0
    where

    -- stripe n xss returns two lists,
    -- the first containing the head of each of the first n non-empty lists in xss,
    -- the second containing the tails of those n lists
    -- and all of the remaining lists in xss.
    -- empty lists encountered along the way are dropped
    stripe 0 xss          = ([],xss)
    stripe n []           = ([],[])
    stripe n ([]:xss)     = stripe n xss
    stripe n ((x:xs):xss) = 
        let (nstripe, nlists) = stripe (n-1) xss
        in (x:nstripe, xs:nlists)


    -- diagonal' n xss uses stripe n to split up
    -- xss into a chunk of up to n elements representing the
    -- nth diagonal of the original input, and the rest
    -- of the original input for a recursive call to
    -- diagonal' (n+1)

    diagonal' _ [] = []
    diagonal' n xss =
        let (str, xss') = stripe n xss
        in str ++ diagonal' (n+1) xss'
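
If it helps, here is a fairly literal Python translation of stripe and diagonal' that works on ordinary finite lists of lists. It shows the traversal order, but not the laziness that makes the Haskell version usable on infinite input. The structure follows the Haskell above; the loop-based style is just my choice.

```python
def stripe(n, rows):
    """Return (heads of the first n non-empty rows, the rows that remain)."""
    heads, tails = [], []
    i = 0
    while i < len(rows) and len(heads) < n:
        row = rows[i]
        i += 1
        if not row:
            continue              # empty rows are dropped
        heads.append(row[0])
        tails.append(row[1:])
    # rows that were not reached are passed through untouched
    return heads, tails + rows[i:]

def diagonal(rows):
    """Flatten a finite list of finite lists, one diagonal at a time."""
    out, n = [], 0
    while rows:
        heads, rows = stripe(n, rows)
        out.extend(heads)
        n += 1
    return out

print(diagonal([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
# [1, 2, 4, 3, 5, 7, 6, 8, 9]
```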

It's also worth reading this paper about the general concept of diagonalization and breadth-first search of an infinite structure.

