ANTLR - String Recognition Error
I have an ANTLR grammar file with the string definition below:
STRING
: '"' (EscapeSequence | ~('\\'|'"') )* '"' ;
fragment EscapeSequence
: '\\' .
;
But this lexer rule ignores the escape character at the first instance of the quotes. The quote in
id\\=\\"
is recognized as the start of the string even though there is a preceding escape character. This happens only for the first quote. All subsequent quotes, if escaped, are recognized properly.
/id\\= \\"Testing\\" -- Should not be a string as both quotes are escaped
/id\\= "Testing" -- Should be a string between the quotes, since they are not escaped
The main problem to solve is to prevent the lexer from trying to recognize a string if the character immediately preceding a quote is an escape character. If there are multiple escape characters, I need to consider only the one character directly before the starting quote.
ANTLR will automatically provide the behavior you desire in almost every situation. Consider the following input:
/id\=\"Testing\"
The critical requirement involves the location and length of the token preceding the first quote character. In the following block, I add spaces only to illustrate conditions that occur between characters.
/ i d \ = \ " T e s t i n g \ "
           ^
           |
           ----------- Make sure no token can *end* here
By ensuring that the first " character is included as part of the token which also includes the \ character before it, you ensure that the first " character will never be interpreted as the start of a STRING token.
If the above condition is not met, your " character will be treated as the start of a STRING token.
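To make the token-boundary condition concrete, here is a minimal sketch in Python that simulates longest-match lexing. It is not ANTLR and not the asker's actual grammar: the ESCAPED and OTHER rules, the rule tables, and the helper names are all hypothetical, and the regular expressions only approximate the STRING rule from the question. The sample input appends a second, unescaped string so that the naive rule set has a closing quote to find.

```python
import re

# Approximation of the STRING rule from the question:
# '"' (EscapeSequence | ~('\\'|'"'))* '"'
STRING = ("STRING", re.compile(r'"(?:\\.|[^\\"])*"', re.DOTALL))
# Hypothetical rule: a backslash and the character after it form ONE token,
# so no token can end between the backslash and the quote.
ESCAPED = ("ESCAPED", re.compile(r'\\.', re.DOTALL))
# Hypothetical catch-all for any single character.
OTHER = ("OTHER", re.compile(r'.', re.DOTALL))

NAIVE_RULES = [STRING, OTHER]            # a token may end right after a backslash
GUARDED_RULES = [STRING, ESCAPED, OTHER]  # the backslash swallows the next character


def tokenize(text, rules):
    """At each position keep the longest match; ties go to the earliest rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_text = None, ""
        for name, pattern in rules:
            match = pattern.match(text, pos)
            if match and len(match.group()) > len(best_text):
                best_name, best_text = name, match.group()
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, best_text))
        pos += len(best_text)
    return tokens


def string_tokens(text, rules):
    """Return only the text of the STRING tokens."""
    return [tok for name, tok in tokenize(text, rules) if name == "STRING"]


sample = r'/id\=\"Testing\" "x"'

# Without ESCAPED, a token ends just before the escaped quote,
# so that quote is taken as the start of a STRING.
print(string_tokens(sample, NAIVE_RULES))    # ['"Testing\\" "']

# With ESCAPED, each backslash-quote pair is a single token,
# and only the unescaped string is recognized.
print(string_tokens(sample, GUARDED_RULES))  # ['"x"']
```

In a real grammar the same effect comes from whichever rule consumes the backslash together with the character that follows it. The simulation only shows why the position between the two characters must never be a token boundary.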