How to convert regular expressions to grammar and DFA

Original post 2021-04-15 14:54:15 | tags: regex, automata, dfa

Hi, I am trying to work out the regular grammar that generates the language L = (a+ab)*, by first representing the regular expression as a DFA.

The picture below shows the process of expressing the expression as an NFA and then converting it to a DFA.

So the regular grammar you get from the DFA is:

A -> aB | bC | e

B -> aB | bA | e

C -> aC | bC
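
Since the picture is not reproduced here, the DFA can be read back off the grammar. The following is a minimal Python sketch under two assumptions that the post does not state explicitly: a rule X -> aY is taken as a transition from state X to state Y on a, and the states with an e-production (A and B) are taken as accepting. It compares that DFA with the expression (a+ab)*, written as `(a|ab)*` in Python's `re` syntax, on every string up to a given length.

```python
import re
from itertools import product

# Transition table read off the grammar: a rule X -> aY means X --a--> Y.
TRANSITIONS = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "A",
    ("C", "a"): "C", ("C", "b"): "C",
}
# Assumption: nonterminals with an e-production are the accepting states.
ACCEPTING = {"A", "B"}


def dfa_accepts(word, start="A"):
    """Run the DFA on `word` and report whether it ends in an accepting state."""
    state = start
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


def words_up_to(max_len):
    """Yield every string over {a, b} of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            yield "".join(letters)


def mismatches(max_len=8):
    """Return the strings on which the DFA and (a|ab)* disagree."""
    target = re.compile(r"(?:a|ab)*")
    return [w for w in words_up_to(max_len)
            if dfa_accepts(w) != bool(target.fullmatch(w))]


if __name__ == "__main__":
    print(mismatches())
```

An empty list from `mismatches()` only shows agreement on short strings; it is a sanity check, not a proof.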

But the problem is, when you derive the regular expression from this grammar, you get a much more complex expression, not (a+b)*.

C = aC + bC = (a+b)*

B = aB + bA + e = a*(bA+e)

A = aB + bC + e = aa*bA + aa* + b(a+b)* + e = (aa*b)*(aa* + b(a+b)* + e)
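
The steps for B and A follow the pattern of Arden's rule: an equation X = rX + s, where r does not generate e, has the least solution X = r*s. The block below is a sketch of that rule applied to the three equations; it is not the poster's own working. One observation that the post does not make: the equation for C has no constant term (C has no e-production), so Arden's rule gives the empty language for C rather than (a+b)*, and the expression for A then becomes shorter.

```latex
\begin{align*}
B &= aB + (bA + e)
  &&\Rightarrow\quad B = a^{*}(bA + e) \\
A &= aB + bC + e = aa^{*}bA + (aa^{*} + bC + e)
  &&\Rightarrow\quad A = (aa^{*}b)^{*}(aa^{*} + bC + e) \\
C &= (a+b)C + \varnothing
  &&\Rightarrow\quad C = (a+b)^{*}\varnothing = \varnothing \\
A &= (aa^{*}b)^{*}(aa^{*} + e)
  &&\text{(substituting } C = \varnothing\text{)}
\end{align*}
```

Under this reading, every b in a string generated by A is immediately preceded by an a, which is the same set of strings as (a+ab)*.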

I wonder if there is a problem with my solution.

1 Answer

Actually, X = (aa*b)*(aa*+b(a+b)*+e) can be simplified to (a+b)*. Here are the steps I used to simplify X to (a+b)*:

We divide the simplification into three parts: a) the empty string, b) every possible string starting with b, c) every possible string starting with a. If we can obtain all three, then X = (a+b)*.
If you take the first part (aa*b)* as e and look at the second part, aa*+b(a+b)*+e, you will see that we can obtain a) the empty string and b) every possible string starting with b. So we only need to obtain c) every possible string starting with a.
This part was a struggle, but I think I obtained it. Now we look only at strings starting with a. First of all, aa* in the second part gives every string made up only of 'a's. Moreover, if the string contains 2 or more consecutive 'b's, the rest of the string, from the second of those 'b's onward, is accepted by b(a+b)* in the second part. So the only concern is single 'b's that follow a run of 'a's. Finally, those are obtained by the first part, (aa*b)*.

Although it was complicated (and probably unnecessary), this is a proof that X = (a+b)*. As a result, your solution was correct.
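
The case analysis above can be spot-checked by brute force. The sketch below is an illustration, not part of the original answer: it rewrites X in Python's `re` syntax (with + as `|` and the e alternative as an optional group) and lists the strings over {a, b}, up to a chosen length, on which two patterns disagree. The helper name `differences` and the length bound are assumptions made for this example.

```python
import re
from itertools import product

# X from the answer: (aa*b)*(aa* + b(a+b)* + e), with e expressed via "?".
X = re.compile(r"(?:aa*b)*(?:aa*|b(?:a|b)*)?")
# (a+b)*: every string over {a, b}.
ALL_STRINGS = re.compile(r"(?:a|b)*")
# (a+ab)*: the expression from the question.
QUESTION = re.compile(r"(?:a|ab)*")


def differences(left, right, max_len=8):
    """Return the strings of length 0..max_len matched by exactly one pattern."""
    found = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if bool(left.fullmatch(word)) != bool(right.fullmatch(word)):
                found.append(word)
    return found


if __name__ == "__main__":
    print(differences(X, ALL_STRINGS))      # no disagreement on short strings
    print(differences(X, QUESTION)[:3])     # shortest strings where they differ
```

The first call finding no disagreement is consistent with X = (a+b)*, although a bounded search is not a proof. The second call is included because the question starts from (a+ab)*: the string "b" matches X but not (a+ab)*, so the two expressions describe different languages.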