
What's the best way to handle optional tokens in antlr4

Suppose I have the following input:

Great University
Graduated in 2010
Some University
09/2009 - 06/2011
Nice University
06/2011

I want to handle the years of study. My grammar looks like this:

education:
    (section)*
    EOF
    ;

section:
    (school | years)+
   ;

degree:     WORD* DEGREE WORD* SEPARATOR;
years:      WORD* ( (YEAR_START '-')? YEAR_END) WORD* SEPARATOR;
WS          : [ \t\r]+ -> skip;
SEPARATOR   : (NEWLINE | COMMA);
COMMA       : ',';
NEWLINE     : '\n';
SCHOOL      : ('university' | 'University' | 'school' | 'School');
WORD        : [a-zA-Z'()]+;
YEAR_START  : YEAR;
YEAR_END    : YEAR;
YEAR        : (DIGIT DIGIT '/')? [1-2] DIGIT DIGIT DIGIT;
DIGIT       : [0-9];

I'm getting the following errors:

line 1:17 mismatched input '\n' expecting '-'
line 6:17 mismatched input '\n' expecting '-'

How can I handle an optional start year in the grammar?

The lexer can assign only one token type to a given pattern. You expect it to assign the year pattern to three token types and to decide at runtime which one is correct. This is not how ANTLR works: when several lexer rules match the same text, the rule listed first wins.

In your case all years (not only the optional one) will be captured by the first rule, i.e. YEAR_START. This results in the following tokenization:

"Graduated in 2010" -> WORD WORD YEAR_START

The only parser rule that could match this is

 years:      WORD* ( (YEAR_START '-')? YEAR_END) WORD* SEPARATOR;

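but that rule expects a '-' after YEAR_START, and the input has none. A YEAR_END token is never produced either, so the rule cannot match even without the optional part.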

The grammar should work if you delete the YEAR_START and YEAR_END rules and replace all occurrences of them with YEAR. Presumably YEAR_START and YEAR_END were meant to distinguish the start year from the end year, but labels exist for exactly that purpose.
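As a rough sketch of that suggestion, the grammar might look something like the following. Several parts are assumptions rather than anything taken from the question: the grammar name `Education`, the label names `startYear` and `endYear`, and the `school` rule, which the posted grammar references but never defines. `DIGIT` is also turned into a fragment and `SEPARATOR` is written out inline, which the original did not do. The sketch assumes every input line, including the last one, ends with a newline or a comma.

```antlr
grammar Education;

education : section* EOF ;

section   : (school | years)+ ;

// Hypothetical rule: the posted grammar uses `school` without defining it.
school    : WORD* SCHOOL WORD* SEPARATOR ;

// Only one YEAR token type exists; the labels (names are an assumption)
// tell the optional start year apart from the end year.
years     : WORD* (startYear=YEAR '-')? endYear=YEAR WORD* SEPARATOR ;

WS        : [ \t\r]+ -> skip ;
SEPARATOR : '\n' | ',' ;
SCHOOL    : 'university' | 'University' | 'school' | 'School' ;
WORD      : [a-zA-Z'()]+ ;
YEAR      : (DIGIT DIGIT '/')? [1-2] DIGIT DIGIT DIGIT ;

// Fragment: only a building block for YEAR, never emitted as a token itself.
fragment DIGIT : [0-9] ;
```

With the Java target, element labels such as these become fields on the generated context class (here `YearsContext`), so a listener or visitor could read `ctx.endYear` and check whether `ctx.startYear` is null to see if a start year was given.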

If this does not work, please post your complete grammar; the one you posted does not, for example, contain a rule for DEGREE.
