What's wrong with my simple ANTLR grammar?
I am trying to create a very simple ANTLR grammar file which should parse the following file:
Report (MyReport)
Begin
End
Or without a report name:
Report
Begin
End
And here is my grammar file:
grammar RL;
options {
language = Java;
}
report:
REPORT ('(' SPACE* STRING_LITERAL SPACE* ')')?
BEGIN
END
;
REPORT
: 'Report'
;
BEGIN
: 'Begin'
;
END : 'End';
NAME: LETTER (LETTER | DIGIT | '_')*;
STRING_LITERAL : NAME SPACE*;
fragment LETTER: LOWER | UPPER;
fragment LOWER: 'a'..'z';
fragment UPPER: 'A'..'Z';
fragment DIGIT: '0'..'9';
fragment SPACE: ' ' | '\t';
WHITESPACE: SPACE+ { $channel = HIDDEN; };
rule: ;
However, when I debug in ANTLRWorks I always get the following error:
root -> report -> MismatchedTokenException(0!=0)
What's wrong in my grammar file?
Thanks, Green
A couple of remarks:
- Java is the default language, so you can omit language=Java;
- you use SPACE inside a parser rule, while this SPACE token is a fragment. A fragment only serves as a building block for other lexer rules, so the lexer never creates this token on its own: remove it from your parser rule(s);
- "Report " ("Report" followed by a single white-space) is being tokenized as a STRING_LITERAL, not as a REPORT! ANTLR's lexer consumes characters greedily: only when two or more rules match the same number of characters does the rule defined first get precedence.

Try the following instead:
grammar RL;
report
: REPORT ('(' NAME ')')? BEGIN END
;
REPORT : 'Report';
BEGIN : 'Begin';
END : 'End';
NAME : LETTER (LETTER | DIGIT | '_')*;
fragment LETTER : LOWER | UPPER;
fragment LOWER : 'a'..'z';
fragment UPPER : 'A'..'Z';
fragment DIGIT : '0'..'9';
SPACE : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; };
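To make the greedy-matching remark concrete, here is a minimal Python sketch of "longest match wins, earlier rule wins ties". It is only a simplified model of the behaviour described above, not how ANTLR's generated lexer is implemented. The rule lists are hand-written regex equivalents of the lexer rules with fragments inlined, and LPAREN and RPAREN are hypothetical names standing in for the implicit tokens ANTLR creates for '(' and ')'.

```python
import re


def tokenize(text, rules):
    """Longest match wins; on a tie, the rule listed first wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly greater: an earlier rule keeps precedence on ties.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


NAME = r"[A-Za-z][A-Za-z0-9_]*"

# Lexer rules of the grammar from the question (SPACE is a fragment there,
# so it does not appear as a token of its own).
ORIGINAL = [
    ("LPAREN", r"\("),   # hypothetical name for the implicit '(' token
    ("RPAREN", r"\)"),   # hypothetical name for the implicit ')' token
    ("REPORT", r"Report"),
    ("BEGIN", r"Begin"),
    ("END", r"End"),
    ("NAME", NAME),
    ("STRING_LITERAL", NAME + r"[ \t]*"),
    ("WHITESPACE", r"[ \t]+"),
]

# Lexer rules of the corrected grammar.
FIXED = [
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("REPORT", r"Report"),
    ("BEGIN", r"Begin"),
    ("END", r"End"),
    ("NAME", NAME),
    ("SPACE", r"[ \t\r\n]+"),
]

original_tokens = tokenize("Report (MyReport)", ORIGINAL)
fixed_tokens = tokenize("Report (MyReport)", FIXED)
print(original_tokens[0])  # ('STRING_LITERAL', 'Report ')
print(fixed_tokens[0])     # ('REPORT', 'Report')
```

With the original rules, STRING_LITERAL matches seven characters ("Report" plus the trailing space) and beats REPORT, which matches only six. With the corrected rules, REPORT and NAME both match six characters, and REPORT wins because it is defined first.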
green wrote:
What if I want to allow "SPACE" inside Report NAME?
I would still skip spaces in the lexer. Accepting spaces between names but ignoring them in other contexts will result in some clunky rules. Instead of accounting for spaces inside a report's name, I would do something like this:
report
: REPORT ('(' report_name ')')? BEGIN END
;
report_name
: NAME+
;
For the input below, this results in a parse tree whose report_name node holds the four NAME tokens a, name, with and spaces:
Report (a name with spaces) Begin End
green wrote:
so is it possible to allow me use reserved words like 'Report' in the name?
Sure, explicitly add them in the report_name rule:
report_name
: (NAME | REPORT | BEGIN | END)+
;
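As a rough illustration of what the report and report_name rules accept, the following Python sketch hand-writes a tiny tokenizer and recursive-descent parser for the same shape of input. It is an assumption-laden stand-in, not the code ANTLR generates: parse_report and its name_tokens parameter are hypothetical names, and whitespace is simply dropped to mimic the hidden SPACE token.

```python
import re

KEYWORDS = {"Report": "REPORT", "Begin": "BEGIN", "End": "END"}
TOKEN_RE = re.compile(r"\s+|[A-Za-z][A-Za-z0-9_]*|[()]")


def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        lexeme = m.group()
        pos = m.end()
        if lexeme.isspace():
            continue  # whitespace is skipped, like the hidden SPACE token
        if lexeme in ("(", ")"):
            tokens.append((lexeme, lexeme))
        else:
            # A whole word equal to a keyword becomes that keyword token.
            tokens.append((KEYWORDS.get(lexeme, "NAME"), lexeme))
    return tokens


def parse_report(text, name_tokens=("NAME", "REPORT", "BEGIN", "END")):
    """report : REPORT ('(' report_name ')')? BEGIN END ;"""
    tokens = tokenize(text)
    pos = 0

    def expect(kind):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos][0] if pos < len(tokens) else "EOF"
            raise SyntaxError(f"expected {kind}, found {found}")
        pos += 1
        return tokens[pos - 1][1]

    expect("REPORT")
    name = []
    if pos < len(tokens) and tokens[pos][0] == "(":
        expect("(")
        # report_name : (NAME | REPORT | BEGIN | END)+ ;
        while pos < len(tokens) and tokens[pos][0] in name_tokens:
            name.append(tokens[pos][1])
            pos += 1
        if not name:
            raise SyntaxError("report_name needs at least one token")
        expect(")")
    expect("BEGIN")
    expect("END")
    if pos != len(tokens):
        raise SyntaxError("unexpected input after END")
    return name


print(parse_report("Report (a name with spaces) Begin End"))
print(parse_report("Report (Monthly Report) Begin End"))
```

Passing name_tokens=("NAME",) models the earlier NAME+ version of report_name: the second input is then rejected, because the lexer turns the word Report into a REPORT token that the rule does not list.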