Regular expression to find all strings of a's and b's that contain an even number of a's and an even number of b's?
Is it possible to count the number of times a character occurs in a string using a regular expression? Can a regular expression be given that matches all strings of a's and b's containing an even number of a's and an even number of b's?
There is a simple enough finite state machine. It has four states, s00, s01, s10, and s11, depending on whether you have consumed an even or odd number of a's and an even or odd number of b's: the first digit is the parity of the a's and the second is the parity of the b's. The start state (also the end state) is s00, the state reached by consuming an even number of both a's and b's. The transition function looks like this:
d(s00, a) = s10
d(s00, b) = s01
d(s10, a) = s00
d(s10, b) = s11
d(s01, a) = s11
d(s01, b) = s00
d(s11, a) = s01
d(s11, b) = s10
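The table above can be run directly as a lookup. The following is a minimal sketch in Python, not part of the original answer; the names DELTA and accepts are chosen here for illustration, and rejecting any character other than a or b is an assumption about the intended behaviour.

```python
# Transition table copied from the answer. State names encode
# (parity of a's, parity of b's): s10 means odd a's, even b's.
DELTA = {
    ("s00", "a"): "s10", ("s00", "b"): "s01",
    ("s10", "a"): "s00", ("s10", "b"): "s11",
    ("s01", "a"): "s11", ("s01", "b"): "s00",
    ("s11", "a"): "s01", ("s11", "b"): "s10",
}


def accepts(text, start="s00", accepting=("s00",)):
    """Run the machine over text; reject any character other than a or b."""
    state = start
    for ch in text:
        if (state, ch) not in DELTA:
            return False
        state = DELTA[(state, ch)]
    return state in accepting


if __name__ == "__main__":
    for sample in ["", "abab", "aabb", "ab", "aab"]:
        print(repr(sample), accepts(sample))
```

A string is accepted exactly when the walk ends back in s00, so no counting beyond the two parity bits is needed.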
We can eliminate state s11 by folding each path through it into a two-character transition:
d(s00, a) = s10
d(s00, b) = s01
d(s10, a) = s00
d(s10, ba) = s01
d(s10, bb) = s10
d(s01, b) = s00
d(s01, aa) = s01
d(s01, ab) = s10
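Each two-character transition in the reduced table is the composition of two single-character steps through s11 in the full table, so it can be recomputed mechanically. A small sketch, assuming the original four-state table; the helper name run is hypothetical:

```python
# Original four-state transition table from the answer.
DELTA = {
    ("s00", "a"): "s10", ("s00", "b"): "s01",
    ("s10", "a"): "s00", ("s10", "b"): "s11",
    ("s01", "a"): "s11", ("s01", "b"): "s00",
    ("s11", "a"): "s01", ("s11", "b"): "s10",
}


def run(state, word):
    """Apply the single-character transitions for each character of word."""
    for ch in word:
        state = DELTA[(state, ch)]
    return state


if __name__ == "__main__":
    # The four two-step paths that pass through s11.
    for state, word in [("s10", "ba"), ("s10", "bb"), ("s01", "aa"), ("s01", "ab")]:
        print(f"d({state}, {word}) = {run(state, word)}")
```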
From this we can develop a regular expression without lookahead by tracing all possible paths through the FSM that leave the start state and return to it once, and then repeating:
( a (bb|ba(aa)*ab)* (a|ba(aa)*b) | b (aa|ab(bb)*ba)* (b|ab(bb)*a) )*
(The blanks are meaningless; I inserted them to help me keep track of the nesting of parentheses.) The idea is that if the first character is a, you reach s10. From there you can leave s10 and come back to it repeatedly via (bb|ba(aa)*ab)*: bb goes through s11 and straight back, while ba(aa)*ab goes out to s01 and back. Finally you return to s00 (without visiting s10 again) either via a or via ba(aa)*b.

A similar pattern (just swap the occurrences of a and b) gets you from s00 back to s00 via a string that starts with b. And you can make as many trips out and back to s00 as you like, starting each one with either a or b.
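One way to gain confidence in the pattern is to compare it with a direct parity count on every string of a's and b's up to some length. This is only a sketch: the names PATTERN, even_even, and mismatches are made up for this example, and re.VERBOSE is used so the spaces in the pattern are ignored, as the answer intends.

```python
import re
from itertools import product

# The answer's pattern; re.VERBOSE makes the regex engine skip the blanks.
PATTERN = re.compile(
    r"( a (bb|ba(aa)*ab)* (a|ba(aa)*b) | b (aa|ab(bb)*ba)* (b|ab(bb)*a) )*",
    re.VERBOSE,
)


def even_even(text):
    """Reference check: count the characters directly."""
    return text.count("a") % 2 == 0 and text.count("b") % 2 == 0


def mismatches(max_len=10):
    """Return every string over {a, b} up to max_len where the two checks disagree."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            text = "".join(chars)
            if bool(PATTERN.fullmatch(text)) != even_even(text):
                bad.append(text)
    return bad


if __name__ == "__main__":
    # An empty list means the pattern and the count agree on every string tested.
    print(mismatches())
```

This is an exhaustive test up to a bounded length, not a proof; the FSM argument above is what justifies the pattern in general.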
Yes, it is possible using this lookahead-based regex:
^(?=(?:b*ab*a)*b*$)(?=(?:a*ba*b)*a*$)[ab]*$
(?=(?:b*ab*a)*b*$) is a lookahead that makes sure there is an even number of a's in the input by matching zero or more pairs of the b*a sub-pattern, followed by any trailing b's.

A similar check is done for an even number of b's in (?=(?:a*ba*b)*a*$).
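The lookahead version can be tried out with Python's re module, which supports (?=...) lookaheads. The sketch below is an illustration only; the names LOOKAHEAD and is_even_even are hypothetical, and the sample strings are chosen here rather than taken from the answer.

```python
import re

# The lookahead-based pattern from the answer, unchanged.
LOOKAHEAD = re.compile(r"^(?=(?:b*ab*a)*b*$)(?=(?:a*ba*b)*a*$)[ab]*$")


def is_even_even(text):
    """True when the lookahead pattern matches the whole of text."""
    return LOOKAHEAD.match(text) is not None


if __name__ == "__main__":
    for sample in ["", "aabb", "abab", "bbaa", "a", "abb", "aabbc"]:
        counts = (sample.count("a"), sample.count("b"))
        print(repr(sample), counts, is_even_even(sample))
```

Each lookahead scans the whole input without consuming it, so the two parity checks are independent and the final [ab]*$ only has to consume the string.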