
Converting context free grammar into regular expression

I am currently going over CFGs and saw this answer, but I am not sure how they got it. How did they convert the CFG into a regular expression here?

S -> aS|bX|a
X -> aX|bY|a
Y -> aY|a


answer:
R.E -> (a*(a+ba*a+ba*ba*a))

You should learn the basic rules that I have written in my answer "constructing an equivalent regular grammar from a regular expression". Those rules will help you convert a regular expression into a right or left linear grammar, and also a right or left linear grammar into a regular expression.

Note, though, that more than one regular expression (and grammar or automaton) is possible for a language. Below, I have tried to explain how to find the regular expression given as the answer in your textbook. Read each step carefully, along with the linked answer(s), so that you can learn approaches to solve such questions yourself next time.

As a first step in answering such a question, you should be clear about "what language does this grammar generate?" (similarly, if you have an automaton, try to understand the language it represents).

As I said in the linked answer, grammar rules like S → eS | e correspond to "plus closure" and generate strings e⁺. Similarly, you have three pairs of such rules to generate a⁺ in your grammar.

S → aS | a   
X → aX | a  
Y → aY | a    

(Note: a⁺ can also be written as a*a or aa*; it describes one or more 'a'.)
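If you want to convince yourself of the note above, here is a small sketch using Python's `re` module. The helper name `same_language` and the length bound are my own choices for illustration; it only checks short strings over {a, b}, so it is a sanity check and not a proof.

```python
import re
from itertools import product

# The three notations for "one or more a", written in Python regex syntax.
FORMS = ("a+", "a*a", "aa*")

def same_language(max_len):
    """Return True if all three forms accept exactly the same strings
    over {a, b} of length 0..max_len (checked with fullmatch)."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            verdicts = {bool(re.fullmatch(form, word)) for form in FORMS}
            if len(verdicts) != 1:  # the forms disagree on this word
                return False
    return True

print(same_language(6))
```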

Also notice that the grammar has no "null production" such as A → ∧, so none of the variables S, X or Y is nullable. That implies the empty string is not a member of the language of the grammar: ε ∉ L(G).

If you look at the production rules of the start variable S:

S → aS | bX | a

Then it is clear that strings ω in the language can start with either symbol 'a' or symbol 'b', as you have two choices when applying S productions: (1) S → aS | a, which gives 'a' as the first symbol in ω, or (2) S → bX, which is used to produce strings that start with symbol 'b'.

Now, what is the minimum-length string ω in L(G)? The minimum-length string is "a", which is possible using the production rule S → a.

Next, note that "b" ∉ L(G), because if you apply S → bX then later on you have to replace X in the sentential form bX using one of X's production rules, and as we know X is also not nullable, hence there will always be some symbol(s) after 'b'. In other words, every string ω derived from the sentential form bX has |ω| ≥ 2.
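The membership claims above ("a" ∈ L(G), but ε and "b" are not) can be checked mechanically. The following is a minimal sketch, assuming I encode each production as a (terminal, next variable) pair, with `None` marking a rule such as S → a that ends the derivation. The names `GRAMMAR` and `derives` are mine, not from any textbook.

```python
# Right-linear grammar from the question.
# ("a", "S") encodes S -> aS; ("a", None) encodes S -> a.
GRAMMAR = {
    "S": [("a", "S"), ("b", "X"), ("a", None)],
    "X": [("a", "X"), ("b", "Y"), ("a", None)],
    "Y": [("a", "Y"), ("a", None)],
}

def derives(variable, word):
    """Return True if `variable` can derive exactly `word`."""
    if not word:
        return False  # no variable is nullable, so nothing derives ""
    head, rest = word[0], word[1:]
    for terminal, next_variable in GRAMMAR[variable]:
        if terminal != head:
            continue
        if next_variable is None:
            if not rest:  # terminating rule must consume the last symbol
                return True
        elif derives(next_variable, rest):
            return True
    return False

for candidate in ["", "a", "b", "ba"]:
    print(repr(candidate), derives("S", candidate))
```

Each production consumes exactly one symbol, so the recursion depth is bounded by the length of the word.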

From the above discussion, it is clear that using the S production rules you can generate the sentential forms a*a or a*bX in two steps:

  1. For a*, use S → aS repeatedly, which gives S ⇝ a*S (the symbol ⇝ means a derivation in any number of steps)

  2. Replace S on the right-hand side of S ⇝ a*S to get either a*a or a*bX

Also, "a*a or a*bX" can be written as S ⇝ a*(a + bX), or as S ⇝ (a*(a + bX)) if you like to parenthesize the complete expression.

Now compare the production rules of S and X: both have the same form! So, as I have shown above for S, you can also say for X that it generates the sentential forms X ⇝ (a*(a + bY)).

To derive the regular expression given in the answer, replace X by (a*(a + bY)) in S ⇝ a*(a + bX); you will get:

S ⇝ a*(a + b X )  
S ⇝ a*(a + b (a*(a + bY)) )

And now the last ones: the Y production rules are comparatively very simple. They are just used to create the "plus closure" a⁺ (or a*a).

So let's also replace Y in the sentential form derived from S.

S ⇝ a*(a + b(a*(a + bY)))   
  ⇝ a*(a + b(a*(a + ba*a)))

Simplify it: apply the distributive law twice to remove the inner parentheses and concatenate the regular expressions. P(Q + R) can be written as PQ + PR.

  ⇝ a*(a + b(a*(a + ba*a)))     
  ⇝ a*(a + b(a*a + a*ba*a))     
  ⇝ a*(a + ba*a + ba*ba*a)
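As a sanity check on the final expression, here is a rough sketch that compares it with the grammar on all short strings. Everything in it is my own scaffolding: the `GRAMMAR` encoding, the helper names `generate` and `matched`, and the length bound. The union `+` is written as `|` because that is how Python's `re` spells alternation. It checks bounded lengths only, so it supports the derivation but does not prove it.

```python
import re
from itertools import product

# ("a", "S") encodes S -> aS; ("a", None) encodes S -> a.
GRAMMAR = {
    "S": [("a", "S"), ("b", "X"), ("a", None)],
    "X": [("a", "X"), ("b", "Y"), ("a", None)],
    "Y": [("a", "Y"), ("a", None)],
}

# a*(a + ba*a + ba*ba*a) in Python regex syntax.
PATTERN = re.compile(r"a*(a|ba*a|ba*ba*a)")

def generate(max_len):
    """All words of length <= max_len derivable from S."""
    words = set()
    frontier = [("", "S")]  # (terminals produced so far, pending variable)
    while frontier:
        prefix, variable = frontier.pop()
        for terminal, next_variable in GRAMMAR[variable]:
            word = prefix + terminal
            if len(word) > max_len:
                continue
            if next_variable is None:
                words.add(word)
            else:
                frontier.append((word, next_variable))
    return words

def matched(max_len):
    """All words over {a, b} of length <= max_len accepted by PATTERN."""
    return {
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product("ab", repeat=n)
        if PATTERN.fullmatch("".join(letters))
    }

print(generate(8) == matched(8))
```

Both sets turn out to be the nonempty strings that end in 'a' and contain at most two 'b's, which matches what the grammar allows: each of S → bX and X → bY can be used at most once.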

: In regular expressions in formal languages, + is used in two ways: (i) + as a binary operator means "union operation"; (ii) + as a unary superscript operator means "plus closure"
: In regex in programming languages, + is only used for "plus closure"
: In regex we use the | symbol for union, but it is not exactly a union operator. The union (A ∪ B) is the same as (B ∪ A), but in a regex (A | B) may not behave the same as (B | A)
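To illustrate the last note, here is a small example with Python's `re` module, whose alternation tries the alternatives from left to right. The patterns and the input string are just ones I picked for the demonstration; other regex engines may behave differently.

```python
import re

text = "ab"

# Prefix matching: the first alternative that succeeds wins.
first = re.match(r"a|ab", text).group()
second = re.match(r"ab|a", text).group()
print(first, second)

# When the whole string must match, the order no longer matters here.
whole_first = re.fullmatch(r"a|ab", text).group()
whole_second = re.fullmatch(r"ab|a", text).group()
print(whole_first, whole_second)
```

So the set of strings accepted as a whole is the same for both orders, but the substring that a search or prefix match returns can differ.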

What you can observe from the question is that the grammar, apart from being a CFG, is also right linear. So you can construct a finite automaton for this right linear grammar. Once you have the finite automaton constructed, there exists a regular expression for the same language, and the conversion can be done using the steps given in this site.
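The first half of that route (grammar to automaton) can be sketched as follows. I am assuming the usual construction: one state per variable plus one extra accepting state, here called `F` (a name I made up), where A → aB becomes a transition A --a--> B and A → a becomes A --a--> F. The function names are hypothetical as well. The result is an NFA, simulated by tracking the set of reachable states.

```python
# ("a", "S") encodes S -> aS; ("a", None) encodes S -> a.
GRAMMAR = {
    "S": [("a", "S"), ("b", "X"), ("a", None)],
    "X": [("a", "X"), ("b", "Y"), ("a", None)],
    "Y": [("a", "Y"), ("a", None)],
}
FINAL = "F"  # extra accepting state, not a grammar variable

def build_nfa(grammar):
    """Map (state, symbol) to the set of successor states."""
    delta = {}
    for variable, rules in grammar.items():
        for terminal, next_variable in rules:
            target = FINAL if next_variable is None else next_variable
            delta.setdefault((variable, terminal), set()).add(target)
    return delta

def accepts(delta, word, start="S"):
    """Simulate the NFA by tracking every state it could be in."""
    current = {start}
    for symbol in word:
        following = set()
        for state in current:
            following |= delta.get((state, symbol), set())
        current = following
    return FINAL in current

nfa = build_nfa(GRAMMAR)
for candidate in ["a", "b", "bba", "aabaabaa", "bbba"]:
    print(candidate, accepts(nfa, candidate))
```

From such an automaton, a regular expression can then be obtained by a standard method such as state elimination; that second half is not shown here.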


 