Properly catching an unclosed string in ANTLR4
I have to define a string literal in ANTLR4 and catch UNCLOSE_STRING exceptions.
Strings are surrounded by a pair of "" and may contain the following supported escapes:
\b \f \r \n \t \' \\
The only way for " to appear inside a string is to be preceded by a ' (that is, '").
I have tried various ways to define a string literal, but they were all caught by UNCLOSE_STRING:
program: global_variable_part function_declaration_part EOF;
<!-- Shenanigans of statements ...-->
fragment Character: ~( [\b\f\r\n\t"\\] | '\'') | Escape | '\'"';
fragment Escape: '\\' ( 'b' | 'f' | 'r' | 'n' | 't' | '\'' | '\\');
fragment IllegalEscape: '\\' ~( 'b' | 'f' | 'r' | 'n' | 't' | '\'' | '\\') ;
STR_LIT: '"' Character* '"' {
content = str(self.text)
self.text = content[1:-1]
};
UNCLOSE_STRING: '"' Character* ([\b\f\r\n\t\\] | EOF) {
esc = ['\b', '\t', '\n', '\f', '\r', '\\']
content = str(self.text)
raise UncloseString(content)
};
For example, "ab'"c\\\\n def" should match, but only Unclosed String: ab'"c\\n def" was produced.
This is quite close to the specification for strings in Java. Don't be afraid to "borrow" from other grammars. A slight modification to the Java lexer rules that (I think) matches your needs would be:
StringLiteral
: '"' StringCharacters? '"'
;
fragment
StringCharacters
: StringCharacter+
;
fragment
StringCharacter
: ~["\\\r\n]
| EscapeSequence
;
fragment
EscapeSequence
: '\\' [btnfr'\\]
| '\'"' // <-- the '" escape match
;
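To try inputs without regenerating the lexer, the rules above can be approximated with Python regular expressions. This is only a rough sketch: the names CHAR, CLOSED, UNCLOSED and scan_string are made up for illustration, the unclosed-string fallback is an assumption that is not part of the rules above, and Python's re module is a backtracking engine rather than ANTLR's lexer, so the longest-match choice is emulated by hand.

```python
import re

# One string character, mirroring StringCharacter plus the '" pair.
# The '" alternative comes first so the pair is preferred over a lone '.
CHAR = r"""(?:'"|\\[btnfr'\\]|[^"\\\r\n])"""

CLOSED = re.compile(rf'"{CHAR}*"')   # mirrors StringLiteral
UNCLOSED = re.compile(rf'"{CHAR}*')  # assumed fallback: no closing quote


def scan_string(src):
    """Classify the string token at the start of src by longest match."""
    unclosed = UNCLOSED.match(src)
    if unclosed is None:
        return None  # src does not start with a double quote
    closed = CLOSED.match(src)
    # On a tie the closed form wins, like a lexer rule that is listed first.
    if closed is not None and closed.end() >= unclosed.end():
        return ("STR_LIT", closed.group()[1:-1])   # strip both quotes
    return ("UNCLOSE_STRING", unclosed.group()[1:])  # strip opening quote


print(scan_string(r'''"ab'"c\\n def"'''))
print(scan_string('"abc'))
```

One consequence of preferring the longest match in this sketch: an input such as "ab'" ; is reported as unclosed, because '" is read as an escaped quote rather than as a ' followed by the closing quote.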
If you know of another language that is a closer match, you can see how it handles strings by looking up its grammar here (ANTLR4 Grammars).