简体   繁体   中英

Simple YACC grammar with problems

I want to implement a simple YACC parser, but I have problems with my grammar:

%union {
    s       string
    b       []byte
    t       *ASTNode
}

%token AND OR NOT
%token<s> STRING
%token<b> BYTES

%type<t> expr

%left AND
%left OR
%right NOT

%%

expr: '(' expr ')'              {{ $$ = $2 }}
    | expr AND expr             {{ $$ = NewASTComplexNode(OPND_AND, $1, $3) }}
    | expr AND NOT expr         {{ $$ = NewASTComplexNode(OPND_AND, $1, NewASTComplexNode(OPND_NOT, $4, nil)) }}
    | NOT expr AND expr         {{ $$ = NewASTComplexNode(OPND_AND, NewASTComplexNode(OPND_NOT, $2, nil), $4) }}
    | expr OR expr              {{ $$ = NewASTComplexNode(OPND_OR, $1, $3) }}
    | STRING                    {{ $$ = NewASTSimpleNode(OPRT_STRING, $1) }}
    | BYTES                     {{ $$ = NewASTSimpleNode(OPRT_BYTES, $1) }}
    ;

Cam someone explain me why it gives me these errors?:

rule expr:  NOT expr AND expr  never reduced
1 rules never reduced

conflicts: 3 reduce/reduce

In a comment, it is clarified that the requirement is that:

The NOT operator should apply only to operands of AND and [the operands] shouldn't be both NOT .

The second part of that requirement is a little odd, since the AND operator is defined to be left-associative. That would mean that

a AND NOT b AND NOT c

would be legal, because it is grouped as (a AND NOT b) AND NOT c , in which both AND operators have one positive term. But rotating the arguments (which might not change the semantics at all) produces:

NOT b AND NOT c AND a

which would be illegal because the first grouping ( NOT b AND NOT c ) contains two NOT expressions.

It might be that the intention was that any conjunction (sequence of AND operators) contain at least one positive term.

Both of these constraints are possible, but they cannot be achieved using operator precedence declarations.

Operator precedence can be used to resolve ambiguity in an ambiguous grammar (and expr: expr OR expr is certainly ambiguous, since it allows OR to associate in either direction). But it cannot be used to import additional requirements on operands, particularly not a requirement which takes both operands into account [Note 1]. In order to do that, you need to write out an unambiguous grammar. Fortunately, that's not too difficult.

The basis for the unambiguous grammar is what is sometimes called cascading precedence style; this effectively encodes the precedence into rules, so that precedence declarations are unnecessary. The basic boolean grammar looks like this:

expr: conjunction           /* Cascade to the next level */
    | expr OR conjunction
conjunction
    : term                  /* Continue the cascade */
    | conjunction AND term
term: atom                  /* NOT is a prefix operator */
    | NOT term              /* which is not what you want */
atom: '(' expr ')'
    | STRING

Each precedence level has a corresponding non-terminal, and the levels "cascade" in the sense that each one, except the last, includes the next level as a unit production.

To adapt this to the requirement that NOT be restricted to at most one operand of an AND operator, we can write out the possibilities more or less as you did in the original grammar, but respecting the cascading style:

expr: conjunction
    | expr OR conjunction
conjunction
    : atom                     /* NOT is integrated, so no need for 'term' */
    | conjunction AND atom
    | conjunction AND NOT atom /* Can extend with a negative */
    | NOT atom AND atom        /* Special case for negative at beginning */
    
atom: '(' expr ')'            
    | STRING

The third rule for conjunction ( conjunction: conjunction AND NOT atom ) allows any number of NOT applications to be added at the end of a list of operands, but does not allow consecutive NOT operands at the beginning of the list. A single NOT at the beginning is allowed by the fourth rule.

If you prefer the rule that a conjunction has to have at least one positive term, you can use the following very simple adaptation:

expr: conjunction
    | expr OR conjunction
conjunction
    : atom                     /* NOT is integrated, so no need for 'term' */
    | conjunction AND atom
    | conjunction AND NOT atom
    | negative AND atom        /* Possible initial list of negatives */
negative                       /* This is not part of the cascade */
    : NOT atom
    | negative AND NOT atom
atom: '(' expr ')'            
    | STRING

In this variant, negative will match, for example, NOT a AND NOT b AND NOT c . But because it's not in the cascade, that sequence doesn't automatically become a valid expression. In order for it to be used as an expression, it needs to be part of conjunction: negative AND atom , which requires that the sequence include a positive.

Notes

  1. The %nonassoc precedence declaration can be used to reject chained use of operators from the same precedence level. It's a bit of a hack, and can sometimes have unexpected consequences. The expected use case is operators like < in languages like C which don't have special handling for chained comparison; using %nonassoc you can declare that chaining is illegal in the precedence level for comparison operators.

    But %nonassoc only works within a single precedence level, and it only works if all the operators in the precedence level require the same treatment. If the intended grammar does not fully conform to those requirements, it will be necessary -- as with this grammar -- to abandon the use of precedence declarations and write out an unambiguous grammar.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM