
How to convert a regular expression to a finite state machine?

Consider the regular expression:

r = (a*|(ab)*)b*

What are the rules for converting this expression to a finite state machine?

The rules for converting general regular expressions can be found in the literature (e.g. Aho et al., "Compilers: Principles, Techniques, and Tools"), but programming them takes considerable effort. Many open-source implementations are now available for this task and for other operations on finite-state machines and transducers, e.g. OpenFst, SFST, Foma, and HFST (which provides a common interface to the other three). They are available as standalone programs, as libraries, and through e.g. Python. Below, your example expression is compiled with the hfst-xfst standalone program (see http://hfst.github.io/ for more information).

$ hfst-xfst
hfst[0]: regex [a*|[a b]*]b* ;
? bytes. 6 states, 10 arcs, ? paths
hfst[1]: print net
Sfs0:   b -> fs1, a -> fs2.
fs1:    b -> fs1.
fs2:    b -> fs3, a -> fs4.
fs3:    b -> fs1, a -> s5.
fs4:    b -> fs1, a -> fs4.
s5: b -> fs3.
hfst[1]: 
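To sanity-check the printed network, the transition table can be copied into a small simulator. The following is only a sketch: it assumes that the "S" prefix marks the start state and that an "fs" prefix marks a final state (so s5 is the only non-final state), and the names DFA, accepts and cross_check are made up for this illustration. It compares the automaton with Python's own re module on every string over {a, b} up to a given length.

```python
import re
from itertools import product

# Transition table copied from the `print net` output.
# Assumption: "S" marks the start state, "fs" a final state, "s" a non-final one.
DFA = {
    "fs0": {"b": "fs1", "a": "fs2"},
    "fs1": {"b": "fs1"},
    "fs2": {"b": "fs3", "a": "fs4"},
    "fs3": {"b": "fs1", "a": "s5"},
    "fs4": {"b": "fs1", "a": "fs4"},
    "s5": {"b": "fs3"},
}
START = "fs0"
FINAL = {"fs0", "fs1", "fs2", "fs3", "fs4"}


def accepts(word):
    """Run the DFA on word; a missing arc means the word is rejected."""
    state = START
    for symbol in word:
        state = DFA[state].get(symbol)
        if state is None:
            return False
    return state in FINAL


def cross_check(max_len):
    """Return the words up to max_len on which the DFA and `re` disagree."""
    pattern = re.compile(r"(a*|(ab)*)b*")
    mismatches = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if accepts(word) != bool(pattern.fullmatch(word)):
                mismatches.append(word)
    return mismatches
```

Calling cross_check(8) should return an empty list if the table was copied correctly.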

The given regular expression is:

r = (a*|(ab)*)b*

To make the DFA easier to design, the given regular expression can be broken into parts that are then combined again. Let us break it into a*, ab, (ab)*, b*, a+b, a*+(ab)*, and (a*|(ab)*)b*, where + denotes union (written | in the original expression).

Now a* can be made into a finite automaton:

[figure: automaton for a*]

Now ab can be made as:

[figure: automaton for ab]

b* can be made as:

[figure: automaton for b*]

Applying the same looping construction used for b* to the automaton for ab gives (ab)*:

[figure: automaton for (ab)*]

Now the union a+b:

[figure: automaton for a+b]

Placing a* in place of a and (ab)* in place of b in a+b gives a*+(ab)*:

[figure: automaton for a*+(ab)*]

Finally, a*+(ab)* and b* can be joined using the same concatenation method as for ab. The result is the required finite state automaton for (a*|(ab)*)b*.
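The piece-by-piece assembly described here can be sketched in Python. This is a Thompson-style construction with ε-moves, which is one common way to do it and not necessarily the exact drawing used in this answer. The helper names (symbol, concat, union, star, accepts) are invented for the sketch. Each fragment is a triple of start state, end state and edge list, and a label of None stands for ε.

```python
from itertools import count

_ids = count()  # fresh state numbers, so fragments never share states


def symbol(ch):
    """Fragment that accepts exactly the one-symbol word ch."""
    s, t = next(_ids), next(_ids)
    return (s, t, [(s, ch, t)])


def concat(x, y):
    """x followed by y: link the end of x to the start of y with an ε-move."""
    return (x[0], y[1], x[2] + y[2] + [(x[1], None, y[0])])


def union(x, y):
    """x or y: a new start branches to both, both ends lead to a new end."""
    s, t = next(_ids), next(_ids)
    extra = [(s, None, x[0]), (s, None, y[0]), (x[1], None, t), (y[1], None, t)]
    return (s, t, x[2] + y[2] + extra)


def star(x):
    """Zero or more repetitions of x."""
    s, t = next(_ids), next(_ids)
    extra = [(s, None, x[0]), (s, None, t), (x[1], None, x[0]), (x[1], None, t)]
    return (s, t, x[2] + extra)


def closure(states, edges):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p, label, r in edges:
            if p == q and label is None and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(nfa, word):
    """Simulate the ε-NFA by tracking the set of currently possible states."""
    start, end, edges = nfa
    current = closure({start}, edges)
    for ch in word:
        moved = {r for p, label, r in edges if p in current and label == ch}
        current = closure(moved, edges)
    return end in current


# Assemble the parts in the same order as in the text.
a_star = star(symbol("a"))
ab_star = star(concat(symbol("a"), symbol("b")))
b_star = star(symbol("b"))
nfa = concat(union(a_star, ab_star), b_star)
```

With this, accepts(nfa, "ababbb") is true and accepts(nfa, "aba") is false. The machine has many more states than a hand-drawn one because every operator adds its own start and end state.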

The rules for converting a regular expression to a finite state machine are:

1. Divide the expression into parts so that each part is easy to understand.
2. Make the finite state machines for those partial expressions.
3. Join those partial machines one by one.
4. The result is an NFA.
5. If a DFA is wanted, convert the NFA to an equivalent DFA using the ε-closure (subset construction) method.
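Step 5 can be sketched as follows. The ε-NFA below is a small hand-written machine for (a*|(ab)*)b*, chosen for this illustration and not taken from the answer. The names EPS, DELTA, eps_closure and subset_construction are likewise assumptions of the sketch. Each DFA state is the set of NFA states that could be active at the same time.

```python
# Hand-written ε-NFA for (a*|(ab)*)b* (an assumption of this sketch):
#   0 --ε--> 1, 0 --ε--> 2     choose the a* branch or the (ab)* branch
#   1 --a--> 1, 1 --ε--> 4     a* loop, then move on to b*
#   2 --a--> 3, 3 --b--> 2     (ab)* loop
#   2 --ε--> 4, 4 --b--> 4     move on to b*, b* loop
EPS = {0: {1, 2}, 1: {4}, 2: {4}}
DELTA = {(1, "a"): {1}, (2, "a"): {3}, (3, "b"): {2}, (4, "b"): {4}}
START, FINALS, ALPHABET = 0, {4}, "ab"


def eps_closure(states):
    """All NFA states reachable from `states` through ε-moves alone."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in EPS.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)


def subset_construction():
    """Build the DFA whose states are ε-closed sets of NFA states."""
    start = eps_closure({START})
    table, todo = {}, [start]
    while todo:
        current = todo.pop()
        if current in table:
            continue
        table[current] = {}
        for ch in ALPHABET:
            moved = set()
            for q in current:
                moved |= DELTA.get((q, ch), set())
            if moved:  # no arc is stored for the empty set (dead state)
                target = eps_closure(moved)
                table[current][ch] = target
                todo.append(target)
    finals = {s for s in table if s & FINALS}
    return start, table, finals


start, table, finals = subset_construction()
for state, arcs in table.items():
    print(sorted(state), {ch: sorted(t) for ch, t in arcs.items()})
```

For this particular NFA the construction produces 6 DFA states and 10 arcs, 5 of the states being final. Those are the same counts reported by hfst-xfst in the other answer.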
