How to define a rule for a pattern repeated a fixed number of times using ANTLR grammar

I know '+', '?' and '*'. But what if I want something to repeat itself, say, 5 times? For example, what if an identifier must be a string of hexadecimal digits of length 5?

To be more specific, I'm thinking about defining a general lexer rule of unlimited length and then, at parsing time, counting how many times it repeated; if the count equals 5, the token would be renamed as another type of token. How can I do this? Or is there an easier way?

"at parsing time count how many times it repeated, if it equals 5, then rename it as another type of token, but how can I do this? Or is there some easy way?"

Yes, you can do that with a disambiguating semantic predicate:

grammar T;

parse
 : (short_num | long_num)+ EOF
 ;

short_num
 : {input.LT(1).getText().length() == 5}? NUM
 ;

long_num
 : {input.LT(1).getText().length() == 8}? NUM
 ;

NUM
 : '0'..'9'+
 ;

SP
 : ' ' {skip();}
 ;

which will parse the input "12345 12345678" as follows:

[image: parse tree of "12345 12345678", with 12345 under short_num and 12345678 under long_num]

But you can also change the type of the token in the lexer based on some property of the matched text, like this:

grammar T;

parse
 : (SHORT | LONG)+ EOF
 ;

NUM
 : '0'..'9'+
   {
     if(getText().length() == 5) $type = SHORT;
     if(getText().length() == 8) $type = LONG;
     // when the length is other than 5 or 8, the type of the token will stay NUM
   }
 ;

SP
 : ' ' {skip();}
 ;

fragment SHORT : ;
fragment LONG : ;

which will cause the same input to be parsed like this:

[image: parse tree of "12345 12345678", with 12345 as a SHORT token and 12345678 as a LONG token]
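
For readers who want to see the retyping logic in isolation, here is a minimal plain-Java sketch of what the action in the NUM rule decides. It does not use the ANTLR runtime; the class LengthTyper, its Type enum, and the methods typeFor and typesOf are hypothetical names made up for this illustration. It assumes, like the grammar, that the input consists only of digit runs separated by spaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lexer action above; not part of the ANTLR runtime.
final class LengthTyper {
    enum Type { NUM, SHORT, LONG }

    // Mirrors the action in the NUM rule: the type depends only on the text length.
    static Type typeFor(String digits) {
        if (digits.length() == 5) return Type.SHORT;
        if (digits.length() == 8) return Type.LONG;
        return Type.NUM; // any other length keeps the default type
    }

    // Splits on spaces (the SP rule skips them) and assigns a type to every digit run.
    static List<Type> typesOf(String input) {
        List<Type> types = new ArrayList<>();
        for (String word : input.trim().split(" +")) {
            if (!word.matches("[0-9]+")) {
                throw new IllegalArgumentException("not a digit run: " + word);
            }
            types.add(typeFor(word));
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(typesOf("12345 12345678 42")); // prints [SHORT, LONG, NUM]
    }
}
```

The point of the design is that the lexer still has a single digit rule; only the token type handed to the parser changes, so the parser rules can refer to SHORT and LONG directly.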

You need to specify it 5 times, for example:

ZIPCODE: '0'..'9' '0'..'9' '0'..'9' '0'..'9' '0'..'9'; 

Alternatively, you can use a validating semantic predicate:

DIGIT: '0'..'9';
zipcode
@init { int N = 0; }
  :  (DIGIT { N++; } )+ { N == 5 }?
  ;

See: What is a 'semantic predicate' in ANTLR?
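
The count-then-validate idea behind the zipcode rule can also be sketched by hand in plain Java, which may help when reasoning about what the predicate accepts. This is only an illustration under the assumption that exactly five digits are required, as in the question; ZipcodeCheck, REQUIRED_DIGITS and isZipcode are hypothetical names, and none of this is code generated by ANTLR.

```java
// Hypothetical helper that mimics the zipcode rule by hand; not generated by ANTLR.
final class ZipcodeCheck {
    // Assumption: exactly five digits are required, as asked in the question.
    static final int REQUIRED_DIGITS = 5;

    static boolean isZipcode(String input) {
        int n = 0; // plays the role of N from the @init block
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < '0' || c > '9') {
                return false; // the character would not match DIGIT
            }
            n++; // same as the { N++; } action after each DIGIT
        }
        return n == REQUIRED_DIGITS; // same check as the validating predicate
    }

    public static void main(String[] args) {
        System.out.println(isZipcode("12345"));  // prints true
        System.out.println(isZipcode("123456")); // prints false
    }
}
```

As in the grammar, the digits are consumed first and the count is checked only at the end, so an input of the wrong length is rejected after it has been read rather than before.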
