How to construct a CFG based on a given regular expression
I am trying to figure out how to construct a CFG (context-free grammar) based on a given regular expression, for example a(ab)*(a|b). I think there is an algorithm to go through, but it is really confusing. Here is what I've got so far:
S->aAB;
A->aAb|empty;
B->a|b;
Does this look right? Any help would be appreciated.
Construct the CFG in three parts, one each for a, (ab)*, and (a|b).
For (a|b), you've got B -> a | b right.
(ab)* means zero or more repetitions of ab: the empty string, ab, abab, ababab, and so on. So A -> abA | empty would be the correct production.
Hence, the full grammar becomes:
S -> aAB
A -> abA | empty
B -> a | b
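One way to sanity-check a grammar like this is to enumerate every string it derives up to some length and compare that set with what the regular expression matches. The following is only a minimal sketch: the names GRAMMAR, generate and regex_language are made up for illustration, non-terminals are assumed to be single uppercase letters, and "" stands in for the empty production.

```python
import re
from itertools import product

# Grammar from the answer above; "" stands for the empty production.
GRAMMAR = {
    "S": ["aAB"],
    "A": ["abA", ""],
    "B": ["a", "b"],
}


def generate(grammar, start="S", max_len=7):
    """Return every terminal string of length <= max_len derivable from start."""
    results, seen, stack = set(), set(), [start]
    while stack:
        form = stack.pop()
        # Locate the leftmost non-terminal, if there is one.
        idx = next((i for i, c in enumerate(form) if c in grammar), None)
        if idx is None:
            results.add(form)
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Terminals never disappear, so overly long forms can be pruned.
            terminals = sum(1 for c in new if c not in grammar)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return results


def regex_language(pattern, alphabet="ab", max_len=7):
    """Return every string over alphabet of length <= max_len matching pattern."""
    rx = re.compile(pattern)
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return {w for w in words if rx.fullmatch(w)}


print(sorted(generate(GRAMMAR), key=lambda w: (len(w), w)))
print(generate(GRAMMAR) == regex_language(r"a(ab)*(a|b)"))
```

Agreement up to a length bound is evidence rather than proof, but it catches mistakes such as swapping abA for aAb very quickly.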
Note: A -> aAb | empty would derive strings like ab, aabb, aaabbb, and so on. That set is not a regular language, so the production can't possibly represent a regular expression.
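To see the difference between the two productions concretely, here is a small sketch that applies each recursive rule n times and then finishes with the empty production. The helper names derive_nested and derive_repeated are invented for this illustration.

```python
import re


def derive_nested(n):
    """Apply A -> aAb n times, then A -> empty."""
    form = "A"
    for _ in range(n):
        form = form.replace("A", "aAb")
    return form.replace("A", "")


def derive_repeated(n):
    """Apply A -> abA n times, then A -> empty."""
    form = "A"
    for _ in range(n):
        form = form.replace("A", "abA")
    return form.replace("A", "")


for n in range(4):
    nested, repeated = derive_nested(n), derive_repeated(n)
    # Check each result against the sub-expression (ab)*.
    print(n, repr(nested), bool(re.fullmatch(r"(ab)*", nested)),
          repr(repeated), bool(re.fullmatch(r"(ab)*", repeated)))
```

The two rules agree only for n = 0 and n = 1; from n = 2 on, aAb nests the letters (aabb) while abA repeats the pair (abab).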
Another way to construct a context-free grammar for a given regular expression is to build a finite-state automaton (FSA) that accepts the same language, use its states as the non-terminals, and add a rule of the form X -> t Y for every state-machine transition from state X to state Y on terminal symbol t. If your CFG notation allows it, each final state F gets a rule of the form F -> epsilon. If your CFG notation doesn't allow such rules, then for each transition from state X to final state F on terminal t, produce the rule X -> t (in addition to the rule X -> t F already described). The result is a regular grammar, a context-free grammar that obeys the additional constraint that each right-hand side has at most one non-terminal.
For the example given, assume we construct the following FSA (one of the many that accept the same language as the regular expression):
From this, it is straightforward to derive the following regular grammar:
S -> a A1
A1 -> a A2
A2 -> b B3
B3 -> a A2
B3 -> a A4
B3 -> b B5
A1 -> a A4
A1 -> b B5
A4 -> epsilon
B5 -> epsilon
epsilon ->
Or, if we don't want rules with an empty right-hand side, drop the last three rules of that grammar and add:
A1 -> a
A1 -> b
B3 -> a
B3 -> b
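Because the derivation is mechanical, it is easy to script. The sketch below assumes the transition list can be read back off the grammar rules above (S, A1, A2, B3, A4, B5 as states, A4 and B5 as final states); the names TRANSITIONS, FINAL_STATES, fsa_to_grammar and format_rule are hypothetical, and the separate "epsilon ->" rule from the listing is folded into the output format rather than emitted as its own rule.

```python
# Transitions as (from_state, terminal, to_state). This list is an assumption,
# reconstructed from the grammar rules listed above.
TRANSITIONS = [
    ("S", "a", "A1"),
    ("A1", "a", "A2"),
    ("A2", "b", "B3"),
    ("B3", "a", "A2"),
    ("B3", "a", "A4"),
    ("B3", "b", "B5"),
    ("A1", "a", "A4"),
    ("A1", "b", "B5"),
]
FINAL_STATES = ["A4", "B5"]


def fsa_to_grammar(transitions, final_states, allow_epsilon=True):
    """Turn FSA transitions into regular-grammar rules as (lhs, rhs_symbols)."""
    # One rule X -> t Y per transition from X to Y on t.
    rules = [(x, [t, y]) for x, t, y in transitions]
    if allow_epsilon:
        # Each final state F gets F -> epsilon.
        rules += [(f, []) for f in final_states]
    else:
        # Otherwise add X -> t for every transition into a final state.
        rules += [(x, [t]) for x, t, y in transitions if y in final_states]
    return rules


def format_rule(lhs, rhs):
    """Render a rule in the notation used in this thread."""
    return f"{lhs} -> {' '.join(rhs) if rhs else 'epsilon'}"


for rule in fsa_to_grammar(TRANSITIONS, FINAL_STATES, allow_epsilon=False):
    print(format_rule(*rule))
```

With allow_epsilon=False the extra rules come out as B3 -> a, B3 -> b, A1 -> a, A1 -> b, the same four additions as above in a different order.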
Compared with other approaches, this method has the disadvantage that the resulting grammar is more verbose than it needs to be, and the advantage that the derivation can be entirely mechanical, which makes it easier to get right without having to think hard.
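If you want extra reassurance that the regular grammar really matches a(ab)*(a|b), you can run it as a recognizer and compare against Python's re module on all short strings. This is a rough sketch, not a proof: RULES encodes the epsilon-free grammar as (non-terminal, terminal, next non-terminal or None) triples, and the function name accepts is made up for the example.

```python
import re
from itertools import product

# Epsilon-free regular grammar: (lhs, terminal, next non-terminal or None).
RULES = [
    ("S", "a", "A1"),
    ("A1", "a", "A2"),
    ("A2", "b", "B3"),
    ("B3", "a", "A2"),
    ("B3", "a", "A4"),
    ("B3", "b", "B5"),
    ("A1", "a", "A4"),
    ("A1", "b", "B5"),
    ("A1", "a", None),
    ("A1", "b", None),
    ("B3", "a", None),
    ("B3", "b", None),
]


def accepts(rules, word, start="S"):
    """Return True if the grammar derives word, tracking reachable non-terminals."""
    current = {start}
    for i, ch in enumerate(word):
        is_last = i == len(word) - 1
        following = set()
        for lhs, terminal, nxt in rules:
            if lhs not in current or terminal != ch:
                continue
            if nxt is None:
                # A rule X -> t can only finish the derivation on the last symbol.
                if is_last:
                    return True
            else:
                following.add(nxt)
        current = following
    return False


pattern = re.compile(r"a(ab)*(a|b)")
mismatches = [
    "".join(p)
    for n in range(9)
    for p in product("ab", repeat=n)
    if accepts(RULES, "".join(p)) != bool(pattern.fullmatch("".join(p)))
]
print(mismatches)
```

An empty list means the grammar and the regular expression agree on every string over {a, b} of length 8 or less.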