Regular expressions representing BNF
I am working on a lexer. I need to identify patterns in a BNF grammar and specify them using regular expressions. I know there are tokens, e.g. keywords, identifiers, operators, etc. So far I have defined these regular expressions:
digit=[0-9]
integer={digit}+
letter=[a-zA-Z]
But the given BNF rule for an identifier is:
<id> ::= <letter>
       | "_"
       | <id> <digit>
       | <id> <letter>
Since <id> is defined by a recursive pattern, how can I express this pattern using a regular expression? I tried this regular expression:
id={letter}+|"_"|{id}{digit}+
for the above BNF rule, but it gives me an error saying that the regular expression contains a cycle.
Looking at the BNF, we can see that an <id> can begin with a letter or an underscore. We can also see that an <id> can be followed by either a digit or a letter and still be a valid <id>. This implies that an <id> begins with either a letter or an underscore and can be followed by 0 or more digits or letters. This suggests the following regular expression:
id = [a-zA-Z_][0-9a-zA-Z]*
[a-zA-Z_] matches a letter or '_'.
[0-9a-zA-Z]* matches a digit or letter 0 or more times.
But since you already have {digit} and {letter} defined as individual character classes, in JFlex this would be (I am not that familiar with JFlex, so I may not have the syntax exactly right):
id = ({letter}|_)({digit}|{letter})*
This would be equivalent to the regex:
([a-zA-Z]|_)([0-9]|[a-zA-Z])*
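As a rough sanity check, the regex can be compared against a recognizer written directly from the BNF. This is only a sketch in Python, not JFlex. The helper names (matches_bnf, matches_regex, is_letter, is_digit) and the sample strings are made up for illustration, and it assumes that <letter> and <digit> mean ASCII letters and digits, as in the definitions above.

```python
import re

# The regex from above, with the {letter} and {digit} macros expanded.
ID_RE = re.compile(r"[a-zA-Z_][0-9a-zA-Z]*")


def is_letter(ch):
    # Assumption: <letter> means an ASCII letter, i.e. [a-zA-Z].
    return ch.isascii() and ch.isalpha()


def is_digit(ch):
    # Assumption: <digit> means an ASCII digit, i.e. [0-9].
    return ch.isascii() and ch.isdigit()


def matches_bnf(s):
    """Recognize <id> from the BNF by unrolling the left recursion into a loop."""
    if not s:
        return False
    # Base cases: <id> ::= <letter> | "_"
    if not (is_letter(s[0]) or s[0] == "_"):
        return False
    # Recursive cases: <id> ::= <id> <digit> | <id> <letter>
    return all(is_letter(ch) or is_digit(ch) for ch in s[1:])


def matches_regex(s):
    # fullmatch requires the whole string to be an identifier.
    return ID_RE.fullmatch(s) is not None


# Hypothetical sample inputs; both recognizers should agree on each one.
samples = ["x", "_", "_tmp1", "abc123", "1abc", "a_b", "", "a-b"]
for s in samples:
    assert matches_bnf(s) == matches_regex(s)
```

Note that, following the BNF as given, an underscore is only accepted as the first character, so a string like a_b is rejected by both the rule and the regex.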