
identifier token keyword antlr parser

How can I handle the case where the token 'for' is used in two different ways in the language being parsed: as a statement keyword and as a plain "parameter", as in the following example?

echo for print example
for i in {0..10..2}
  do
     echo "Welcome $i times"
 done

Output:

for print example
Welcome 0 times
Welcome 2 times
Welcome 4 times
Welcome 6 times
Welcome 8 times
Welcome 10 times

Thanks.

The only way I can see to do this is to define an Echo rule in your lexer grammar that matches the characters echo followed by all other characters except \r and \n:

Echo
  :  'echo' ~('\r' | '\n')+
  ;

and make sure that rule comes before the rules that match identifiers and keywords (like for).
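
To make the ordering idea concrete outside of ANTLR, here is a minimal Python sketch of a tokenizer in which the echo pattern is tried before the keyword and identifier patterns. The names TOKEN_SPEC, MASTER and tokenize are made up for this illustration, and Python's regex alternation simply takes the first alternative that matches, which is not how ANTLR's lexer makes its decisions; treat it as an illustration of the ordering only.

```python
import re

# Patterns are tried in order, so Echo is listed before the keywords and Identifier.
TOKEN_SPEC = [
    ("Echo",       r"echo[^\r\n]+"),          # 'echo' plus the rest of the line
    ("For",        r"for\b"),
    ("In",         r"in\b"),
    ("Do",         r"do\b"),
    ("Done",       r"done\b"),
    ("Identifier", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("Integer",    r"[0-9]+"),
    ("Punct",      r"\{|\}|\.\."),
    ("NewLine",    r"\r\n|\n|\r"),
    ("Space",      r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, text) pairs, dropping Space tokens much like {skip();} does."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "Space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for kind, value in tokenize("echo for print example\nfor i in {0..10..2}\n"):
        print(kind, repr(value))
```

In this sketch the whole first line comes back as a single Echo token, so the for inside it never reaches the For pattern, while the for at the start of the second line is tokenized as the keyword.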

A quick demo of a possible start would be:

grammar Test;

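parse
  :  (echo | forStatement)*
  ;

echo
  :  Echo (NewLine | EOF)
  ;

// named forStatement because a rule called 'for' clashes with the Java keyword in the generated parser
forStatement
  :  For Identifier In range NewLine
     Do NewLine
     echo
     Done (NewLine | EOF)
  ;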

range
  :  '{' Integer '..' Integer ('..' Integer)? '}'
  ;

Echo
  :  'echo' ~('\r' | '\n')+
  ;

For  : 'for';
In   : 'in';
Do   : 'do';
Done : 'done';

Identifier
  :  ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*
  ;

Integer
  :  '0'..'9'+
  ;

NewLine
  :  '\r' '\n'
  |  '\n'
  |  '\r'
  ;

Space
  :  (' ' | '\t') {skip();}
  ;

If you parse the input:

echo for print example
for i in {0..10..2}
do
  echo "Welcome $i times"
done
echo the end for now!

with it, the resulting parse tree would look like:

(image: http://img571.imageshack.us/img571/5713/grammar.png)

(I had to rotate the image a bit, otherwise it wouldn't be visible at all!)
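
As a side note, the range rule in the grammar accepts an optional third integer. A rough Python sketch of how that rule's shape could be matched and turned into the values seen in the question's output follows; RANGE and expand_range are hypothetical names, and the inclusive upper bound and default step of 1 are assumptions based on the sample output, not something the grammar itself specifies.

```python
import re

# Mirrors the shape of: '{' Integer '..' Integer ('..' Integer)? '}'
RANGE = re.compile(r"\{(\d+)\.\.(\d+)(?:\.\.(\d+))?\}")


def expand_range(text):
    """Expand '{start..stop}' or '{start..stop..step}' into a list of integers."""
    match = RANGE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a range: {text!r}")
    start, stop = int(match.group(1)), int(match.group(2))
    # Assumption: a missing step means 1, and the upper bound is inclusive.
    step = int(match.group(3)) if match.group(3) else 1
    return list(range(start, stop + 1, step))


if __name__ == "__main__":
    print(expand_range("{0..10..2}"))
```

For the sample input {0..10..2} this yields 0, 2, 4, 6, 8 and 10, matching the "Welcome N times" lines in the question.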

HTH.

Well, it's pretty easy; most grammars use something like this:

TOKEN_REF
    :   'A'..'Z' ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
    ;

So when referring to a print statement you would do something like:

'print' (TOKEN_REF)*

And with a for statement you just explicitly state 'for', such as:

'for' INT 'in' SOMETHING

To let a keyword such as for also appear as ordinary text, you need to use a semantic predicate so that the keyword's lexer rule is only taken when the input really is the for keyword.

Details are available on the keywords as identifiers page on the ANTLR wiki.
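
The idea behind such a predicate can be sketched in Python as a context check that gates the keyword decision. The function classify_words is hypothetical, and the condition used here (for is a keyword only when it is the first word of a line) is an assumption chosen to fit the example in the question; the predicate described on the wiki page may use a different condition.

```python
def classify_words(line):
    """Label each word; 'for' counts as a keyword only as the first word of the line."""
    labels = []
    for index, word in enumerate(line.split()):
        # This boolean plays the role of the semantic predicate (assumed condition).
        at_statement_start = index == 0
        if word == "for" and at_statement_start:
            labels.append(("For", word))
        else:
            labels.append(("Word", word))
    return labels


if __name__ == "__main__":
    print(classify_words("echo for print example"))
    print(classify_words("for i in {0..10..2}"))
```

With this check, the for in "echo for print example" is labelled as a plain word, while the for that opens "for i in {0..10..2}" is labelled as the keyword.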
