I have this grammar below for implementing an IN
operator taking a list of numbers or strings.
grammar listFilterExpr;
listFilterExpr: entityIdNumberListFilter | entityIdStringListFilter;
entityIdNumberProperty
: 'a.Id'
| 'c.Id'
| 'e.Id'
;
entityIdStringProperty
: 'f.phone'
;
listFilterExpr
: entityIdNumberListFilter
| entityIdStringListFilter
;
listOperator
: '$in:'
;
entityIdNumberListFilter
: entityIdNumberProperty listOperator numberList
;
entityIdStringListFilter
: entityIdStringProperty listOperator stringList
;
numberList: '[' ID (',' ID)* ']';
fragment ID: [1-9][0-9]*;
stringList: '[' STRING (',' STRING)* ']';
STRING
: '"'(ESC | SAFECODEPOINT)*'"'
;
fragment ESC
: '\\' (["\\/bfnrt] | UNICODE)
;
fragment SAFECODEPOINT
: ~ ["\\\u0000-\u001F]
;
If I try to parse the following input:
c.Id $in: [1,1]
Then I get the following error in the parser:
mismatched input '1' expecting ID
Please help me to correct this grammar.
Update
I found this following rule way above in the huge grammar file of my project that might be matching '1' before it gets to match to ID
:
NUMBER
: '-'? INT ('.' [0-9] +)?
;
fragment INT
: '0' | [1-9] [0-9]*
;
But, If I write my ID
rule before NUMBER
then other things fail, because they have already matched ID
which should have matched NUMBER
What should I do?
As mentioned by rici: ID
should not be a fragment
. Fragments can only be used by other lexer rules, they will never become a token on their own (and can therefor not be used in parser rules).
Just remove the fragment
keyword from it: ID: [1-9][0-9]*;
Note that you'll also have to account for spaces. You probably want to skip them:
SPACES : [ \t\r\n] -> skip;
... mismatched input '1' expecting ID ...
This looks like there's another lexer, besides ID
, that also matches the input 1
and is defined before ID
. In that case, have a look at this Q&A: ANTLR 4.5 - Mismatched Input 'x' expecting 'x'
Because you have the rules ordered like this:
NUMBER
: '-'? INT ('.' [0-9] +)?
;
fragment INT
: '0' | [1-9] [0-9]*
;
ID
: [1-9][0-9]*
;
the lexer will never create an ID
token (only NUMBER
tokens will be created). This is just how ANTLR works: in case of 2 or more lexer rules match the same amount of characters, the one defined first "wins".
In the first place I think it's odd to have an ID
rule that matches only digits, but, if that's the language you're parsing, OK. In your case, you could do something like this:
id : POS_NUMBER;
number : POS_NUMBER | NEG_NUMBER;
POS_NUMBER : INT ('.' [0-9] +)?;
NEG_NUMBER : '-' POS_NUMBER;
fragment INT
: '0' | [1-9] [0-9]*
;
and then instead of ID
, use id
in your parser rules. As well as using number
instead of the NUMBER
you're using now.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.