
ANTLR takes a whole line of tokens as a single token

I'm new to ANTLR and tried to write a simple parser. The rules look valid to me, but when I ran the TestRig (grun) with the -gui argument on the 'var' rule and entered this:

var myVar = 13

it reported: line 1:0 mismatched input 'var myVar = 13' expecting 'var'

I can't figure out what is wrong with it. Here's the grammar:

grammar Leaf;

WS:     (' '|'\t'|'\n'|'\r')+ -> skip;

NUM:    ('0'..'9') ('0'..'9'|'.')*;
CHAR:   ('a'..'z'|'A'..'Z');

ID:     CHAR (CHAR|NUM)*;

BOOL:   ('true'|'false');

STRING: ~('\r'|'\n'|'"')+;

type:   'int'|'byte'|'float'|'double'|'decimal'|'char'|'bool'|'tuple'|'string'|'type';
value:  NUM|BOOL|('[' (value ',')+ ']')|('\'' CHAR '\'')|('"' STRING '"')|('(' (type ',')+ ')')|type;

var:    'var' ID('[]')? (':' type)? '=' (type|value)?;

Thanks for feedback!

Lexer rules in ANTLR are greedy: at each position the lexer picks the rule that matches the most characters. Because of that, the rule STRING:

STRING: ~('\r'|'\n'|'"')+;

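consumes your entire input. It accepts any run of characters that are not a line break or a double quote, so the whole line var myVar = 13 becomes a single STRING token and the parser never sees a 'var' token.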
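The longest-match behaviour can be reproduced outside ANTLR. The following is a minimal sketch, not ANTLR's actual implementation: the rule lists are regex stand-ins for the lexer rules in the question, the names QUESTION_RULES, FIXED_RULES and tokenize are made up for this illustration, and it assumes that literals used in parser rules ('var', '=') rank ahead of named rules when two matches have the same length.

```python
import re

# Regex stand-ins for the question's lexer rules, in priority order.
# Assumption: literals used in parser rules ('var', '=') act as implicit
# tokens that rank ahead of the named rules when match lengths tie.
QUESTION_RULES = [
    ("'var'", r"var"),
    ("'='", r"="),
    ("WS", r"[ \t\r\n]+"),
    ("NUM", r"[0-9][0-9.]*"),
    ("CHAR", r"[a-zA-Z]"),
    ("ID", r"[a-zA-Z](?:[a-zA-Z]|[0-9][0-9.]*)*"),
    ("BOOL", r"true|false"),
    ("STRING", r'[^\r\n"]+'),
]

# Same rules, except STRING must now start and end with a double quote.
FIXED_RULES = [
    (name, r'"[^\r\n"]*"' if name == "STRING" else pattern)
    for name, pattern in QUESTION_RULES
]


def tokenize(text, rules):
    """Split text into (rule, lexeme) pairs using longest-match selection."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for name, pattern in rules:
            match = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie.
            if match and match.end() > pos and (best is None or match.end() > best[1]):
                best = (name, match.end())
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        name, end = best
        if name != "WS":  # mirrors '-> skip' in the grammar
            tokens.append((name, text[pos:end]))
        pos = end
    return tokens


print(tokenize("var myVar = 13", QUESTION_RULES))
print(tokenize("var myVar = 13", FIXED_RULES))
```

With the question's rules the whole line comes back as one STRING token, which matches the error message. With the quoted STRING rule the same input splits into 'var', ID, '=' and NUM.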

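What you need to do is remove the double quotes from your value parser rule and make them part of the STRING lexer rule instead, so that STRING only matches quoted text. The same goes for the single quotes around CHAR: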

grammar Leaf;

var
 : 'var' ID ('[' ']')? (':' type)? '=' (type | value)?
 ;

value
 : NUM
 | BOOL
 | '[' value (',' value)* ']'
 | CHAR
 | STRING
 | '(' type (',' type)* ')'
 | type
 ;

type
 : 'int'
 | 'byte'
 | 'float'
 | 'double'
 | 'decimal'
 | 'char'
 | 'bool'
 | 'tuple'
 | 'string'
 | 'type'
 ;

WS     : (' '|'\t'|'\n'|'\r')+ -> skip;

BOOL   : ('true' | 'false');

NUM    : DIGIT+ ('.' DIGIT*)?;

STRING : '"' ~('\r'|'\n'|'"')* '"';

CHAR   : '\'' LETTER '\'';

ID     : LETTER (LETTER | DIGIT)*;

fragment LETTER : [a-zA-Z];
fragment DIGIT  : [0-9];


 