Shift/Reduce Conflict in Yacc/Flex

I have this grammar in yacc:

%{
    #include <stdio.h>
%}

%token texto SEP ERRO word

%start Ini

%%

Ini: Directivas SEP SEP Conceitos '$'
            { printf("Terminou bem...\n"); return 0; };

Directivas: Directiva
          | Directivas SEP Directiva
          ;

Conceitos: Conceito
         | Conceitos SEP SEP Conceito
         ;

Conceito: word SEP Atributos;

Atributos: Atributo
         | Atributos SEP Atributo
         ;

Directiva: texto;
Atributo: '-' texto;

%%

int main(){
    yyparse();
}

int yyerror(char *s){
    fprintf(stderr, "%s\n", s);
    return 0;
}

And in flex:

%{
    #include "y.tab.h"
%}

%%

[a-zA-Z]+           return word;

[a-zA-Z ]+          return texto;

\-                  return '-';

\n                  return SEP;

[ \t]               ;

.                   return ERRO;

<<EOF>>             return '$';

I want to write a parser that validates input like this:

text line
text line
text line

word
-text line
-text line
-text line

word
-text line

where the first lines are the 'Directivas', followed by one blank line and then the 'Conceitos'. Each Conceito is one word followed by a few text lines, each beginning with a '-'. The Conceitos are separated from each other by one blank line.

But yacc reports a shift/reduce conflict. I am new to this and can't figure out why.

Sorry for my English

Thank you

Use yacc's (or bison's) -v option to get a full listing of the generated parser and the grammar conflicts in the y.output file. When you do this with your grammar, you get something like (from bison):

State 16 conflicts: 1 shift/reduce
        :
state 16

    6 Conceito: word SEP Atributos .
    8 Atributos: Atributos . SEP Atributo

    SEP  shift, and go to state 20

    SEP       [reduce using rule 6 (Conceito)]
    $default  reduce using rule 6 (Conceito)

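This tells you exactly where the conflict is: after reducing an Atributos and looking at a SEP lookahead, the parser doesn't know whether it should shift the SEP to parse another Atributo after it, or reduce the Conceito, which would only be valid if there's another SEP after the SEP (two-token lookahead needed). For example, after word SEP '-' texto the input may continue with SEP '-' texto (another Atributo) or with SEP SEP word (a new Conceito), and in both cases the next token is SEP.

One way to avoid this would be to have your lexer return two consecutive SEPs (a blank line) as a single token, and use that token in the grammar wherever it currently has SEP SEP: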

\n      return SEP;
\n\n    return SEP_SEP;

You might want to allow whitespace on the blank line or more than a single blank line instead:

\n([ \t]*\n)+  return SEP_SEP;
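For reference, here is a sketch of how the grammar might look once the lexer returns SEP_SEP for a blank line. It assumes SEP_SEP is added to the %token list and replaces each SEP SEP pair in Ini and Conceitos; everything else is the grammar from the question, with only the yylex/yyerror declarations and the return statements added so it compiles without warnings. With this change a SEP lookahead after Atributos can only mean "another Atributo follows", so one token of lookahead should be enough.

```yacc
%{
    #include <stdio.h>
    int yylex(void);
    int yyerror(char *s);
%}

/* SEP_SEP is the blank-line token returned by the lexer (assumed change) */
%token texto SEP SEP_SEP ERRO word

%start Ini

%%

/* the blank line between Directivas and Conceitos is now one token */
Ini: Directivas SEP_SEP Conceitos '$'
            { printf("Terminou bem...\n"); return 0; };

Directivas: Directiva
          | Directivas SEP Directiva
          ;

/* Conceitos are separated by a blank line, i.e. a single SEP_SEP */
Conceitos: Conceito
         | Conceitos SEP_SEP Conceito
         ;

Conceito: word SEP Atributos;

/* after Atributos: SEP means shift, SEP_SEP or '$' means reduce Conceito */
Atributos: Atributo
         | Atributos SEP Atributo
         ;

Directiva: texto;
Atributo: '-' texto;

%%

int main(void){
    return yyparse();
}

int yyerror(char *s){
    fprintf(stderr, "%s\n", s);
    return 0;
}
```

Running yacc -v (or bison -v) on this version and checking y.output again is a quick way to confirm that the conflict is gone.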
