
Deriving a regular grammar for the language recognised by a finite automaton

Original post: 2017-10-30 04:49:02. Tags: context-free-grammar, regular-language, automata, finite-automata, computation-theory

I'm having trouble deriving a regular grammar for a language that is recognised by a finite automaton. The key issue is that I confuse regular grammars with context-free grammars. I can't seem to tell the two apart, and I find them very similar in some respects, such as ambiguity.

Could anyone please explain how to derive a regular grammar for the language recognised by an FA?

2 Answers

When we speak of regular grammars, we might be talking about (strictly) regular grammars or (extended) regular grammars. These distinct concepts correspond more or less exactly to DFAs and to generalized NFAs with empty transitions, respectively.

Furthermore, regular grammars are either right-regular or left-regular. I find right-regular grammars altogether easier to think about, but your mileage may vary.

Given a DFA, a (strictly) right-regular grammar can be produced as follows:

1. N = Q; the set of nonterminals of the grammar is the set of states of the DFA.
2. S = q0; the start symbol of the grammar is the initial state of the DFA.
3. P will contain a production X := aY for nonterminals X and Y and alphabet symbol a if and only if there is a transition in the DFA from state X to state Y on input a.
4. P will contain a production X := a for nonterminal X and alphabet symbol a if and only if there is a transition in the DFA from state X to some accepting state on input a.
5. P will contain a production q0 := e (where e denotes the empty string) if and only if the state q0 is accepting in the DFA.

The above construction attempts to avoid adding unnecessary empty productions. If we don't mind having lots of empty productions, an alternative is to dispense with step 4 and, in step 5, add a production X := e if and only if X is an accepting state. This generates the same language.
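The five steps above can be sketched in Python. This is only an illustration under my own assumptions: the function names, the tuple encoding of right-hand sides, and the example DFA (strings over {a, b} with an even number of a's) do not come from the answer.

```python
from typing import Dict, List, Set, Tuple

Rhs = Tuple[str, ...]  # ("a", "Y") for X := aY, ("a",) for X := a, () for X := e


def dfa_to_right_regular(
    states: Set[str],
    start: str,
    accepting: Set[str],
    delta: Dict[Tuple[str, str], str],
    empty_productions: bool = False,
) -> Dict[str, List[Rhs]]:
    """Build the productions P; the nonterminals are the DFA states (steps 1-2)."""
    productions: Dict[str, List[Rhs]] = {q: [] for q in sorted(states)}
    for (x, a), y in sorted(delta.items()):
        productions[x].append((a, y))              # step 3: X := aY
        if y in accepting and not empty_productions:
            productions[x].append((a,))            # step 4: X := a
    for x in sorted(accepting):
        # step 5: only q0 := e, or X := e for every accepting X in the variant
        if empty_productions or x == start:
            productions[x].append(())
    return productions


def derives(productions: Dict[str, List[Rhs]], start: str, word: str) -> bool:
    """Check whether the grammar derives `word` by tracking reachable nonterminals."""
    current = {start}
    for i, ch in enumerate(word):
        # A rule X := a may only be used to finish the word.
        if i == len(word) - 1 and any((ch,) in productions[x] for x in current):
            return True
        current = {
            rhs[1]
            for x in current
            for rhs in productions[x]
            if len(rhs) == 2 and rhs[0] == ch
        }
    return any(() in productions[x] for x in current)


# Hypothetical example DFA: even number of a's; E is the start and accepting state.
STATES = {"E", "O"}
ACCEPTING = {"E"}
DELTA = {("E", "a"): "O", ("E", "b"): "E", ("O", "a"): "E", ("O", "b"): "O"}

grammar = dfa_to_right_regular(STATES, "E", ACCEPTING, DELTA)
for lhs, rules in grammar.items():
    for rhs in rules:
        print(lhs, ":=", "".join(rhs) or "e")
```

Passing `empty_productions=True` gives the variant without step 4. Both grammars should accept exactly the words the DFA accepts, which `derives` lets you spot-check.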

Given a generalized NFA with empty transitions (gNFA-e), an (extended) right-regular grammar can be produced as follows:

1. N = Q; the set of nonterminals of the grammar is the set of states of the gNFA-e.
2. S = q0; the start symbol of the grammar is the initial state of the gNFA-e.
3. P will contain a production X := wY for nonterminals X and Y and string w over the alphabet if and only if there is a transition in the gNFA-e from state X to state Y on input w.
4. P will contain a production X := e if and only if the state X is accepting in the gNFA-e.
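The same idea for the extended case, again as a hedged sketch: the encoding of transitions as `(X, w, Y)` triples, the function names, and the example automaton for (ab)*c are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# (w, Y) encodes X := wY; ("", None) encodes X := e.
ExtRhs = Tuple[str, Optional[str]]


def gnfa_to_extended_grammar(
    states: Set[str],
    accepting: Set[str],
    transitions: List[Tuple[str, str, str]],
) -> Dict[str, List[ExtRhs]]:
    """Build productions from (X, w, Y) transitions; w may be the empty string."""
    productions: Dict[str, List[ExtRhs]] = {q: [] for q in sorted(states)}
    for x, w, y in transitions:
        productions[x].append((w, y))          # step 3: X := wY
    for x in sorted(accepting):
        productions[x].append(("", None))      # step 4: X := e
    return productions


def derives(productions: Dict[str, List[ExtRhs]], start: str, word: str) -> bool:
    """Search over (nonterminal, position) pairs; terminates because pairs are finite."""
    seen = set()
    stack = [(start, 0)]
    while stack:
        x, pos = stack.pop()
        if (x, pos) in seen:
            continue
        seen.add((x, pos))
        for w, y in productions[x]:
            if y is None:
                if pos == len(word):           # X := e ends the derivation
                    return True
            elif word.startswith(w, pos):
                stack.append((y, pos + len(w)))
    return False


# Hypothetical gNFA-e for (ab)*c: q0 loops on "ab", moves to q1 on the
# empty string, and q1 reaches the accepting state q2 on "c".
STATES = {"q0", "q1", "q2"}
TRANSITIONS = [("q0", "ab", "q0"), ("q0", "", "q1"), ("q1", "c", "q2")]
grammar = gnfa_to_extended_grammar(STATES, {"q2"}, TRANSITIONS)
print(derives(grammar, "q0", "ababc"))
```

Because the automaton may be nondeterministic, the check explores every applicable production instead of following a single path.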

Basically, as in rici's linked answer, regular grammars are just an alternate notation for the same underlying information as is present in finite automata. This is unlike, say, regular expressions, which are a fundamentally different (though equivalent) notation for representing regular languages.

The way I understand it, context-free grammars (CFGs) are a good way of describing infinite sets in a finite way, and also of describing the syntax of languages.

CFLs and regular languages: all regular languages are context-free, but not necessarily vice versa. Why?

We can prove this by using the pumping lemma: pumping on the context-free language {a^n b^n | n ≥ 0} shows that it is not regular, yet it is a CFL because it is generated by the grammar G = (V, Σ, R, Start), where:

V: a finite set of variables or nonterminals, e.g. V = {S}
Σ: a finite set which is disjoint from V, called the alphabet or terminals, e.g. Σ = {a, b}
R: a finite set of production rules, e.g. R = {S → aSb, S → ε}
Start: the start variable, an element of V, e.g. Start = S
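To see how the two rules of this grammar produce a^n b^n, here is a small sketch. The helper names `derive` and `in_language` are mine, not from the answer.

```python
from typing import List


def derive(n: int) -> List[str]:
    """Apply S -> aSb n times, then S -> ε; return every sentential form."""
    forms = ["S"]
    for _ in range(n):
        forms.append(forms[-1].replace("S", "aSb"))
    forms.append(forms[-1].replace("S", ""))
    return forms


def in_language(word: str) -> bool:
    """Undo the rules: strip one leading a and one trailing b until ε remains."""
    if word == "":
        return True                      # S -> ε
    if word[0] == "a" and word[-1] == "b":
        return in_language(word[1:-1])   # S -> aSb
    return False


print(" => ".join(derive(2)))  # S => aSb => aaSbb => aabb
```

The recogniser has to remember how many a's are still unmatched, and that count is unbounded. This is the intuition behind the pumping-lemma argument that no finite automaton accepts the language.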

Note that a string w is derived ambiguously in a context-free grammar G if it has two or more different leftmost derivations. A grammar G is ambiguous if it generates some string ambiguously. Sometimes, when we have an ambiguous grammar, we can find an unambiguous grammar that generates the same language. Note, however, that some context-free languages can only be generated by ambiguous grammars; such languages are known as inherently ambiguous.
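As an illustration of ambiguity, take the grammar E → E+E | a. It is a standard textbook example that I am bringing in here; it does not appear in the answer. Each parse tree corresponds to exactly one leftmost derivation, so the sketch below counts parse trees.

```python
from functools import lru_cache


def count_parse_trees(word: str) -> int:
    """Count parse trees of `word` under the example grammar E -> E+E | a."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways E can derive word[i:j].
        total = 1 if word[i:j] == "a" else 0       # E -> a
        for k in range(i + 1, j - 1):
            if word[k] == "+":                     # E -> E+E, split at this '+'
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(word))


print(count_parse_trees("a+a"))    # 1
print(count_parse_trees("a+a+a"))  # 2, so the grammar is ambiguous
```

The grammar E → a+E | a generates the same strings but gives each of them a single parse tree, so this language is not inherently ambiguous.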

Also, any context-free language is generated by a context-free grammar in Chomsky Normal Form. To check whether a string belongs to a CFL, we can use the Cocke-Younger-Kasami (CYK) algorithm.
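A minimal CYK sketch follows. The Chomsky Normal Form grammar used here (S → AB | AT, T → SB, A → a, B → b) is my own assumption; it generates a^n b^n for n ≥ 1, and the empty string is treated separately, as is usual with CNF.

```python
from typing import List, Tuple

# Assumed CNF grammar for {a^n b^n | n >= 1}.
UNIT_RULES: List[Tuple[str, str]] = [("A", "a"), ("B", "b")]
PAIR_RULES: List[Tuple[str, Tuple[str, str]]] = [
    ("S", ("A", "B")),
    ("S", ("A", "T")),
    ("T", ("S", "B")),
]


def cyk(word: str, unit_rules=UNIT_RULES, pair_rules=PAIR_RULES, start: str = "S") -> bool:
    """Return True if the CNF grammar derives `word` (non-empty words only)."""
    n = len(word)
    if n == 0:
        return False  # a CNF grammar of this shape never derives the empty string
    # table[i][l] holds the nonterminals that derive word[i:i+l].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {lhs for lhs, terminal in unit_rules if terminal == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, (b, c) in pair_rules:
                    if b in left and c in right:
                        table[i][length].add(lhs)
    return start in table[0][n]


print(cyk("aabb"))  # True
print(cyk("aab"))   # False
```

The three nested loops over length, start position, and split point give the algorithm its cubic running time in the length of the input.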

A good read is Sipser, M. (2006). Introduction to the Theory of Computation (Vol. 2). Boston: Thomson Course Technology.