
Grammar to recognize an unlimited number of '{' expr '}' groups next to each other

I am writing a C# application using ANTLR4 to recognize the following TeX-like syntax:

{a}{x}+{b}{y}+{c}

My current grammar always takes the last instance of '{' expr '}' and ignores everything before it. Here are some results from the current grammar (shown below):

  • Input: {a} Output: a [Pass]
  • Input: {a}+{x} Output: a + x [Pass]
  • Input: {a}{x} Output: x [Fail] Desired: ax
  • Input: {a}{x}+{b} Output: x + b [Fail] Desired: ax + b
  • Input: {a}{x}+{b}{y} Output: y [Fail] Desired: ax + by
  • Input: {a}{x}+{b}{y}+{c} Output: y + c [Fail] Desired: ax + by + c
  • Input: {a}{x}+{b}{y}+{c}{d} Output: d [Fail] Desired: ax + by + cd

Any ideas on how to fix this?

Grammar MyGra.g4 file:

/*
 * Parser Rules
 */
prog: expr+ ;

expr : '{' expr '}'                 # CB_Expr
     | expr op=('+'|'-') expr       # AddSub
     | '{' ID '}'                   # CB_ID
     | ID                           # ID
     ;

/*
 * Lexer Rules
 */
ID: ('a' .. 'z' | 'A' .. 'Z')+;
ADD : '+';
SUB : '-';
WS:   (' ' | '\r' | '\n') -> channel(HIDDEN);

MyGraVisitor.CS file:

 public override string VisitID(MyGraParser.IDContext context)
 {
     return context.ID().GetText();
 }

 public override string VisitAddSub(MyGraParser.AddSubContext context)
 {
     if (context.op.Type == MyGraParser.ADD)
     {
         return Visit(context.expr(0)) + " + " + Visit(context.expr(1));
     }
     else
     {
         return Visit(context.expr(0)) + " - " + Visit(context.expr(1));
     }
 }

 public override string VisitCB_Expr(MyGraParser.CB_ExprContext context)
 {
     return Visit(context.expr());
 }

 public override string VisitCB_ID(MyGraParser.CB_IDContext context)
 {
     return context.ID().GetText();
 }

Update #1:

It was suggested to include a grammar rule for

'{' expr '}{' expr '}'

However, what if I have {a}{b}{c}{d}+{e}{f}{g}? I thought a grammar was supposed to handle recursive occurrences of a rule through the parse tree. What if I have 1000 {expr} groups next to each other? How many rules would I need then? I think the suggestion is valid, but I am not sure how to account for an unlimited number of adjacent {expr} groups.

Another question I have is: how can I reuse the rule CB_Expr?

Update #2:

I added the rule:

| expr CB_Expr                  # CB_Expr2

with visitor:

public override string VisitCB_Expr2(MyGra.CB_Expr2Context context)
{
    return Visit(context.expr()) + Visit(context.CB_Expr());
}

That did not help; I still get the same output for all of the cases listed above.

Your grammar is ambiguous. For example, the input {x} can have two different parse trees (as Mephy said):

(CB_Expr { (expr (ID x)) })

and

(CB_ID { x })

Removing the CB_ID alternative would fix this with no negative side effects.

For your actual problem, this should do the trick for expr:

expr : left=id_expr op=('+' |'-') right=expr #AddSub
     | id_expr                               #ID_Expr
     ;

id_expr : '{' ID '}' id_expr                 #ID_Ex
        | '{' ID '}'                         #ID
        ;

I have not tested this and have not written any visitors for it, but the grammar should work.

The id_expr rule is recursive, so you can place as many {ID} groups after each other as you want. As the grammar is written, at least one is required.
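To make the recursion concrete, here is a minimal hand-written sketch in C# that mirrors the two rules, assuming each group holds a single ID made of ASCII letters. It is not ANTLR-generated code and not a visitor. The names TexishSketch, Translate, ParseExpr, ParseIdExpr and Expect are hypothetical and exist only for illustration.

```csharp
using System;

// Hand-written mirror of the expr and id_expr rules, for illustration only.
// Not ANTLR output; every name in this class is hypothetical.
public static class TexishSketch
{
    // Parses the whole input and returns the flattened text.
    public static string Translate(string input)
    {
        // Simplification: whitespace is stripped up front, standing in
        // for the lexer sending WS to the hidden channel.
        string s = input.Replace(" ", "").Replace("\r", "").Replace("\n", "");
        int pos = 0;
        string result = ParseExpr(s, ref pos);
        if (pos != s.Length)
            throw new FormatException($"Unexpected '{s[pos]}' at position {pos}");
        return result;
    }

    // expr : id_expr ('+'|'-') expr | id_expr ;
    private static string ParseExpr(string s, ref int pos)
    {
        string left = ParseIdExpr(s, ref pos);
        if (pos < s.Length && (s[pos] == '+' || s[pos] == '-'))
        {
            char op = s[pos++];
            return left + " " + op + " " + ParseExpr(s, ref pos);
        }
        return left;
    }

    // id_expr : '{' ID '}' id_expr | '{' ID '}' ;
    private static string ParseIdExpr(string s, ref int pos)
    {
        Expect(s, ref pos, '{');
        int start = pos;
        while (pos < s.Length && IsAsciiLetter(s[pos])) pos++;
        if (pos == start)
            throw new FormatException($"Expected ID at position {pos}");
        string id = s.Substring(start, pos - start);
        Expect(s, ref pos, '}');

        // Recurse while another group follows; this is what allows
        // any number of adjacent {ID} groups, but never zero.
        if (pos < s.Length && s[pos] == '{')
            return id + ParseIdExpr(s, ref pos);
        return id;
    }

    // Matches the ID lexer rule: letters a-z and A-Z only.
    private static bool IsAsciiLetter(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');

    // Consumes the expected character or fails with its position.
    private static void Expect(string s, ref int pos, char c)
    {
        if (pos >= s.Length || s[pos] != c)
            throw new FormatException($"Expected '{c}' at position {pos}");
        pos++;
    }
}
```

With this sketch, TexishSketch.Translate("{a}{x}+{b}{y}+{c}") returns "ax + by + c", and an input with no group at all throws a FormatException. In an ANTLR visitor, the same concatenation would presumably go in the method for the ID_Ex alternative, returning context.ID().GetText() followed by the result of visiting context.id_expr(). I have not run that against generated code.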

