
Algorithm to generate finite language from non-recursive context-free grammar

I am searching for an algorithm to generate the complete finite language from a non-recursive context-free grammar. The application is the generation of a set of possible scenarios for test automation.

Example grammar in EBNF (S is the start rule):

S = Sender, Receiver;
Sender = Human | Machine;
Human = "user-type-1" | "user-type-2";
Machine = Access, Protocol;
Access = "internal" | "external";
Protocol = "soap" | "smtp";
Receiver = "local" | "remote";

should produce a set of sentences like:

user-type-1 local
internal soap local
external smtp remote

The examples and literature I have found so far refer to randomised generation of examples based on recursive grammars, but my problem is simpler. All hints, names or links to publications are welcome.

Thanks,

S.

One way is to define the grammar in a programming language, and then write code to iterate over all possibilities. Your grammar is specified using variables and three constructions: lit(x), which represents a literal like "local"; alt(a, b, c, ...), which represents a choice of one of the alternatives a, b, c, ...; and seq(a, b, ..., z), which represents a sequence of one thing from a, concatenated with one thing from b, and so on.

Here's your grammar in that form.

Protocol = alt(lit("soap"), lit("smtp"))
Receiver = alt(lit("local"), lit("remote"))
Access = alt(lit("internal"), lit("external"))
Human = alt(lit("user-type-1"), lit("user-type-2"))
Machine = seq(Access, Protocol)
Sender = alt(Human, Machine)
S = seq(Sender, Receiver)

And here's some full (Python) code that uses carefully chosen definitions for alt, seq and lit to make each production rule a function that generates all of its possibilities:

import itertools

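def seq(*xs):
    # Sequence: one sentence from each part, joined with spaces.
    def r():
        # product() yields every combination, one item per sub-rule.
        for f in itertools.product(*[x() for x in xs]):
            yield ' '.join(f)
    return r

def alt(*xs):
    # Choice: every sentence of every alternative, in order.
    def r():
        for x in xs:
            for f in x():
                yield f
    return r

def lit(x):
    # Literal: exactly one sentence, the terminal itself.
    def r():
        yield x
    return r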

Protocol = alt(lit("soap"), lit("smtp"))
Receiver = alt(lit("local"), lit("remote"))
Access = alt(lit("internal"), lit("external"))
Human = alt(lit("user-type-1"), lit("user-type-2"))
Machine = seq(Access, Protocol)
Sender = alt(Human, Machine)
S = seq(Sender, Receiver)

for s in S():
    print(s)

You can recursively generate a tree whose branches represent derivations according to the rules of the grammar and whose leaves represent words in the language of the grammar. Recovering the entire finite language is then as simple as saving off the leaves as you generate them.

Represent each node as an ordered collection of symbols (terminals or nonterminals). For each nonterminal, recursively descend to a new set of nodes where every possible replacement is made. Continue until your list contains only terminal symbols, and then output the in-order concatenation of the symbols in that node. Your initial node will always be [S]. Example:

S = Sender, Receiver;
Sender = Human | Machine;
Human = "user-type-1" | "user-type-2";
Machine = Access, Protocol;
Access = "internal" | "external";
Protocol = "soap" | "smtp";
Receiver = "local" | "remote";

[S]
 [Sender, ",", Receiver]
  [Human, ",", Receiver]
   ["user-type-1", ",", Receiver]
    ["user-type-1", ",", "local"]   ***
    ["user-type-1", ",", "remote"]  ***
   ["user-type-2", ",", Receiver]
    ["user-type-2", ",", "local"]   ***
    ["user-type-2", ",", "remote"]  ***
  [Machine, ",", Receiver]
   [Access, ",", Protocol, ",", Receiver]
    ["internal", ",", Protocol, ",", Receiver]
     ["internal", ",", "soap", ",", Receiver]
      ["internal", ",", "soap", ",", "local"]   ***
      ["internal", ",", "soap", ",", "remote"]  ***
     ["internal", ",", "smtp", ",", Receiver]
      ["internal", ",", "smtp", ",", "local"]   ***
      ["internal", ",", "smtp", ",", "remote"]  ***
    ["external", ",", Protocol, ",", Receiver]
     ["external", ",", "soap", ",", Receiver]
      ["external", ",", "soap", ",", "local"]   ***
      ["external", ",", "soap", ",", "remote"]  ***
     ["external", ",", "smtp", ",", Receiver]
      ["external", ",", "smtp", ",", "local"]   ***
      ["external", ",", "smtp", ",", "remote"]  ***
