
Antlr grammar not matching expected lexer rule

I'm trying to match a duration string, such as "for 30 minutes" or "for 2 hours", using the following rules:

durationPhrase: FOR_STR (MINUTE_DURATION | HOUR_DURATION);

MINUTE_DURATION: NONZERO_NUMBER MINUTE_STR;

HOUR_DURATION: NONZERO_NUMBER HOUR_STR;

MINUTE_STR: 'minute'('s')?;

HOUR_STR: 'hour'('s')?;

FOR_STR: 'for';

NONZERO_NUMBER: [0-9]+;

WS: (' '|[\n\t\r]) -> skip;

With the following input:

for 30 minutes

Attempting to debug/match the durationPhrase rule, I'm presented with the error:

line 1:4 mismatched input '30' expecting {MINUTE_DURATION, HOUR_DURATION}

But I can't figure out which lexer rule the '30' is matching. I was under the impression that the "longest" lexer rule would win, which would be the MINUTE_DURATION rule.

Is it instead matching NONZERO_NUMBER first? And if so, why?

It's matching NONZERO_NUMBER because neither MINUTE_DURATION nor HOUR_DURATION applies. If you had entered 30minutes, it would have matched MINUTE_DURATION, but as a token pattern MINUTE_DURATION cannot match the space character between 30 and minutes.

You ignore whitespace by applying -> skip to the WS token. That can only happen after WS has been recognised as a token, i.e. once the lexer has already split the input. While a lexer rule is being matched, whitespace characters are just characters.

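If you make MINUTE_DURATION and HOUR_DURATION parser (syntax) rules rather than lexer rules, it should work as expected. In ANTLR, that means giving them names that start with a lowercase letter, so that they are matched against the token stream after WS has been skipped.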
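
As a minimal sketch of that change, the grammar from the question could be rewritten along these lines. The grammar name Duration and the rule names minuteDuration and hourDuration are placeholders chosen here for illustration, not names from the original post.

```antlr
grammar Duration;   // hypothetical grammar name

// Parser rules (lowercase names) work on the token stream,
// i.e. after the lexer has already skipped WS tokens.
durationPhrase : FOR_STR (minuteDuration | hourDuration) ;
minuteDuration : NONZERO_NUMBER MINUTE_STR ;
hourDuration   : NONZERO_NUMBER HOUR_STR ;

// Lexer rules (uppercase names) match raw characters.
MINUTE_STR     : 'minute' 's'? ;
HOUR_STR       : 'hour' 's'? ;
FOR_STR        : 'for' ;
NONZERO_NUMBER : [0-9]+ ;   // kept as in the question; note that it also accepts 0
WS             : [ \t\r\n]+ -> skip ;
```

With this version, the input for 30 minutes should be tokenised as FOR_STR, NONZERO_NUMBER, MINUTE_STR (with the WS tokens skipped), which durationPhrase can then match through minuteDuration.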
