Invalid pointer while building an AST with Bison

I'm trying to build an AST for a simple programming language (homework). However, I can't get it to work: the intermediate values ($1, $2, ...) seem to be invalid and don't correspond to what I return from the sub-expressions.

Here is the Bison code for my project (I think the problem is here and not in my AST functions). I've put comments where I encounter invalid values. It's my first project using Bison, so I'm not sure I'm doing things correctly.

I also use Flex, but the Flex code seems to work correctly.

Thanks.

%{
#include <stdio.h>

#include "node.h"
#include "print_node.h"

int yylex();
int yyerror(char * s);

CommandNode * root = NULL;
%}

%union
{
    struct ExpressionNode * expression;
    struct CommandNode    * command;
    int    number;
    char * var;
}

%type   <expression>    E T F
%type   <command>       C

%token  <number>        NUMBER
%token  <var>           VAR

%token                  AF SKIP SEQ IF THEN ELSE WHILE DO ADD SUB MUL EOL

%%

root:           C EOL      { root = $1; return 0; /************ $1 seems to be garbage ************/ }
                ;

E:              E ADD T    { $$ = newAddNode($1,$3); }
        |       E SUB T    { $$ = newSubNode($1,$3); }
        |       T          { $$ = $1;                }
        ;

T:              T MUL F    { $$ = newMulNode($1,$3); }
        |       F          { $$ = $1;                }
        ;

F:              '(' E ')'  { $$ = $2;                }
        |       NUMBER     { $$ = newNumberNode($1); }
        |       VAR        { $$ = newVarNode($1);    }
        ;

C:              SKIP                 { $$ = newSkipNode();       }
        |       VAR AF E             { $$ = newAfNode($1,$3);    }
        |       '(' C ')'            { $$ = $2;                  }
        |       IF E THEN C ELSE C   { $$ = newIfNode($2,$4,$6); }
        |       WHILE E DO C         { $$ = newWhileNode($2,$4); }
        |       C SEQ C              { $$ = newSeqNode($1,$3); /************ $1 and $3 seems to be garbage ************/ }
        ;

%%

int main()
{
    yyparse();
}

int yyerror(char * s)
{
    fprintf(stderr, "yyerror: %s\n", s);
    return 0; /* declared to return int, so return a value */
}

Most commonly, the symptoms you describe happen because your lexer (the Flex code, which you don't show) returns yytext directly. Since yytext points into the scanner's internal buffer, the value looks fine at that moment, but after the next token(s) are read, it mysteriously changes. This will happen if you have a Flex rule like:

[a-zA-Z][a-zA-Z0-9]*    { yylval.var = yytext; return VAR; }

To fix it, you need to make a copy of yytext before returning it to your parser. Something like

[a-zA-Z][a-zA-Z0-9]*    { yylval.var = strdup(yytext); return VAR; }

will do the trick, though it exposes you to memory leaks unless each copy is eventually freed.
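To see why storing the pointer fails while storing a copy works, here is a minimal sketch in plain C that imitates the situation without Flex. Everything in it is hypothetical and only for illustration: scan_buf stands in for the scanner's internal buffer, fake_scan plays the role of matching a token and exposing yytext, and copy_lexeme is a hand-written equivalent of strdup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's internal buffer that yytext
   points into. A real Flex buffer is larger and managed by the scanner. */
static char scan_buf[64];

/* Imitate matching a token: overwrite the shared buffer and return a
   pointer into it, which is how yytext behaves from the parser's view. */
char *fake_scan(const char *lexeme)
{
    strncpy(scan_buf, lexeme, sizeof scan_buf - 1);
    scan_buf[sizeof scan_buf - 1] = '\0';
    return scan_buf;
}

/* Hand-written equivalent of strdup(): allocate and copy the lexeme.
   Returns NULL if allocation fails; the caller owns the result. */
char *copy_lexeme(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Returns 1 when the aliased pointer was clobbered by the next token
   while the private copy kept the original lexeme, 0 otherwise. */
int alias_breaks_but_copy_survives(void)
{
    char *aliased = fake_scan("x");        /* like yylval.var = yytext         */
    char *copied  = copy_lexeme(aliased);  /* like yylval.var = strdup(yytext) */
    if (copied == NULL)
        return 0;

    fake_scan("while");                    /* the next token reuses the buffer */

    int ok = strcmp(aliased, "while") == 0 /* the alias now shows the new text */
          && strcmp(copied, "x") == 0;     /* the copy still holds "x"         */

    free(copied);                          /* release the copy to avoid a leak */
    return ok;
}
```

Calling alias_breaks_but_copy_survives() from a test or from main should return 1. The free at the end is the part that the strdup approach leaves to you: in a real parser, the function that consumes the string (newVarNode or newAfNode here) or the code that tears down the AST would be the natural place to release it.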
