
Fixing Lemon parsing conflicts

I'm writing a small parser for constraints, using Flex and Lemon. Lemon is reporting a couple of parsing conflicts that I haven't been able to get rid of. Are there any particular tricks for getting rid of parsing conflicts in a context-free grammar?

The grammar is as follows.

// Reprint of input file "Constraint_grammar.y".
// Symbols:
//   0 $          5 NE        10 PLUS        15 NOT         20 error
//   1 IFF        6 GT        11 MINUS       16 LPAREN      21 constraint
//   2 AND        7 GTE       12 TIMES       17 RPAREN      22 bool_expr 
//   3 OR         8 LT        13 DIVIDE      18 VARIABLE    23 int_expr
//   4 EQ         9 LTE       14 POWER       19 INTEGER   
constraint ::= bool_expr.
bool_expr ::= LPAREN bool_expr RPAREN.
int_expr ::= LPAREN int_expr RPAREN.
bool_expr ::= int_expr LT int_expr.
bool_expr ::= int_expr GT int_expr.
bool_expr ::= int_expr EQ int_expr.
bool_expr ::= int_expr NE int_expr.
bool_expr ::= int_expr GTE int_expr.
bool_expr ::= int_expr LTE int_expr.
bool_expr ::= bool_expr AND bool_expr.
bool_expr ::= bool_expr OR bool_expr.
bool_expr ::= bool_expr IFF bool_expr.
int_expr ::= int_expr PLUS int_expr.
int_expr ::= int_expr MINUS int_expr.
int_expr ::= int_expr TIMES int_expr.
int_expr ::= int_expr DIVIDE int_expr.
int_expr ::= int_expr POWER int_expr.
bool_expr ::= NOT bool_expr.
int_expr ::= MINUS int_expr.
int_expr ::= VARIABLE.
bool_expr ::= VARIABLE.
int_expr ::= INTEGER.
%nonassoc IFF.
%left AND.
%left OR.
%nonassoc EQ NE GT GTE LT LTE.
%left PLUS MINUS.
%left TIMES DIVIDE.
%right POWER NOT.
%nonassoc LPAREN RPAREN.

The errors are as follows.

State 28:
     (19) int_expr ::= VARIABLE *
     (20) bool_expr ::= VARIABLE *

                             $ reduce 20
                           IFF reduce 20
                           AND reduce 20
                            OR reduce 20
                            EQ reduce 19
                            NE reduce 19
                            GT reduce 19
                           GTE reduce 19
                            LT reduce 19
                           LTE reduce 19
                          PLUS reduce 19
                         MINUS reduce 19
                         TIMES reduce 19
                        DIVIDE reduce 19
                         POWER reduce 19
                        RPAREN reduce 19
                        RPAREN reduce 20  ** Parsing conflict **
State 40:
          bool_expr ::= bool_expr * AND bool_expr
          bool_expr ::= bool_expr * OR bool_expr
          bool_expr ::= bool_expr * IFF bool_expr
     (11) bool_expr ::= bool_expr IFF bool_expr *

                             $ reduce 11
                           IFF shift  4
                           IFF reduce 11  ** Parsing conflict **
                           AND shift  1
                            OR shift  3
                        RPAREN reduce 11

The whole parser generator report is over here. http://pastebin.com/TRsV3WvK

Anyone know what I'm doing wrong here? Can I ignore these conflicts?

I would expect to fix the 'State 28' conflict by distinguishing between a boolean variable and an integer variable, using the symbol table to help determine which type of token is returned. You'd have BOOL_VARIABLE and INT_VARIABLE, perhaps. Testing shows that this change gets rid of the 'State 28' conflict.
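
A minimal sketch of that change to the grammar, assuming the lexer can look each identifier up in a symbol table before it chooses a token. BOOL_VARIABLE and INT_VARIABLE are the names suggested above; how the lexer decides between them is not shown and is an assumption here.

```
// Replaces the two VARIABLE rules from the question.
// The token names are illustrative; the lexer is assumed to pick one
// of them by consulting a symbol table.
int_expr  ::= INT_VARIABLE.
bool_expr ::= BOOL_VARIABLE.
```

With two distinct tokens, the parser no longer has to choose between rules 19 and 20 when it sees RPAREN after a variable.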

The 'State 40' conflict is easily removed by changing the associativity of IFF from nonassoc to left. Is there a good reason not to make it associative?
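
For reference, that is a one-line edit to the precedence declarations in the question. With %left, an input such as a IFF b IFF c is grouped as (a IFF b) IFF c, so the State 40 conflict goes away:

```
%left IFF.   // was: %nonassoc IFF.
```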

You have parser conflicts, which in this case means that the grammar you have specified is ambiguous, i.e. there exists more than one parse tree for some input sequence of terminal symbols. This is quite common, but if we want an unambiguous grammar we need to specify disambiguation rules, such as associativity and precedence, so that exactly one of the parse trees can always be selected.
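
As an illustration of such disambiguation rules, here is how the arithmetic declarations already present in the question's grammar each select one tree. In Lemon, tokens declared later bind more tightly. The operand names a, b and c in the comments are placeholders, and POWER is shown on its own line only for clarity (the question declares it together with NOT).

```
%left PLUS MINUS.     // a MINUS b MINUS c  is read as (a MINUS b) MINUS c
%left TIMES DIVIDE.   // a PLUS b TIMES c   is read as a PLUS (b TIMES c)
%right POWER.         // a POWER b POWER c  is read as a POWER (b POWER c)
```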

I'm not sure what kind of constraints you are parsing with this grammar, but I'm pretty sure you want an unambiguous grammar here; programming languages are (almost) always unambiguous. (However, if the constraints come from some sort of natural-language source, you will probably have to use a more suitable parser.) I'm not sure what Lemon will do if you give it an ambiguous grammar; it probably just prefers one of the transitions in its automaton, which could very well lead to a tree you do not want.
