I'm trying to match a duration string, like "for 30 minutes" or "for 2 hours", using the following rules:
durationPhrase: FOR_STR (MINUTE_DURATION | HOUR_DURATION);
MINUTE_DURATION: NONZERO_NUMBER MINUTE_STR;
HOUR_DURATION: NONZERO_NUMBER HOUR_STR;
MINUTE_STR: 'minute'('s')?;
HOUR_STR: 'hour'('s')?;
FOR_STR: 'for';
NONZERO_NUMBER: [0-9]+;
WS: (' '|[\n\t\r]) -> skip;
With the following input:
for 30 minutes
When I try to debug/match the durationPhrase rule, I get this error:
line 1:4 mismatched input '30' expecting {MINUTE_DURATION, HOUR_DURATION}
But I can't figure out which lexer rule the '30' is matching. I was under the impression that the "longest" lexer rule would win, which would be the MINUTE_DURATION rule.
Is it instead matching NONZERO_NUMBER first? And if so, why?
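It's matching NONZERO_NUMBER because neither of the other patterns applies. If you had entered 30minutes, it would have matched MINUTE_DURATION, but as a token pattern, MINUTE_DURATION won't match the space character. The "longest match" rule only compares the lexer rules that can actually match at the current position; after "30" the next character is a space, so MINUTE_DURATION fails and the two-character NONZERO_NUMBER match is the longest one available.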
You ignore whitespace by applying -> skip to the token WS. That can only happen after WS is recognised as a token, i.e. after tokenisation. During tokenisation, whitespace characters are just characters.
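If you make MINUTE_DURATION and HOUR_DURATION syntax (parser) rules rather than lexical rules, it should work as expected. In ANTLR that means giving them names that start with a lowercase letter, so they are applied to the token stream after the whitespace has already been skipped.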
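As a rough sketch of that change, the grammar from the question could be rearranged as follows. The grammar name Duration and the rule names minuteDuration and hourDuration are placeholders chosen here for illustration; everything else is taken unchanged from the question.

```antlr
grammar Duration;  // hypothetical grammar name

// Parser rules (names start with a lowercase letter). They work on the
// token stream, so the skipped WS tokens never get in the way.
durationPhrase: FOR_STR (minuteDuration | hourDuration);
minuteDuration: NONZERO_NUMBER MINUTE_STR;
hourDuration: NONZERO_NUMBER HOUR_STR;

// Lexer rules (names start with an uppercase letter), as in the question.
MINUTE_STR: 'minute'('s')?;
HOUR_STR: 'hour'('s')?;
FOR_STR: 'for';
NONZERO_NUMBER: [0-9]+;
WS: (' '|[\n\t\r]) -> skip;
```

With this arrangement, the input "for 30 minutes" should be tokenised as FOR_STR, NONZERO_NUMBER, MINUTE_STR (with the whitespace skipped in between), which is the sequence durationPhrase expects via minuteDuration.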