
How to write a pure parser and a reentrant scanner with "win_flex bison"?

I've written a parser for evaluating a logical expression. I know flex and bison use global variables (like yylval). I want a pure parser and a reentrant scanner for multithreaded programming. My '.y' file is here:

%{
#include <stdio.h>
#include <string>
#define YYSTYPE bool

void yyerror(char *);

//int  yylex (YYSTYPE* lvalp);

int yylex(void);
bool parseExpression(const std::string& inp);
%}

%token INTEGER
%left '&' '|'

%%

program:
        program statement '\n'
        | /* NULL */
        ;

statement:
        expression                      { printf("%d\n", $1); return $1; }
        ;

expression:
        INTEGER
        | expression '|' expression     { $$ = $1 | $3; }
        | expression '&' expression     { $$ = $1 & $3; }
        | '(' expression ')'            { $$ = $2; }
        | '!' expression                { $$ = !$2; }
        ;

%%

void yyerror(char *s) {
    fprintf(stderr, "%s\n", s);
}


int main(void) {

    std::string inp = "0|0\n";

    bool nasi = parseExpression(inp);
    printf("%s%d\n", "nasi ", nasi);
    printf("Press ENTER to close. ");
    getchar();
}

My '.l' file is here:

    /* Lexer */
%{
    #include "parser.tab.h"
    #include <stdlib.h>
    #include <string>
    #define YYSTYPE bool
    void yyerror(char *);
%}


%%

[0-1]      {
                if (strcmp(yytext, "0")==0)
                {
                    yylval = false;
                    //*lvalp = false;
                }
                else
                {
                    yylval = true; 
                    //*lvalp = true;
                }

                return INTEGER;
            }

[&|!()\n]     { return *yytext; }

[ \t]   ;       /* skip whitespace */

.               yyerror("Unknown character");

%%

int yywrap(void) {
    return 1;
}

bool parseExpression(const std::string& inp)
{
    yy_delete_buffer(YY_CURRENT_BUFFER);

    /*Copy string into new buffer and Switch buffers*/
    yy_scan_string(inp.c_str());
    bool nasi = yyparse();

    return nasi;


}

I've added %pure_parser to both files, changed the yylex declaration to int yylex (YYSTYPE* lvalp); and replaced yylval with *lvalp, but I get an error: 'lvalp' is an undeclared identifier. There are many examples about 'reentrant' and 'pure', but I can't find a clear guideline.

Could someone guide me?

Thanks in advance.

Fortunately, I got it working. Here is my code. I think it can serve as a guideline for anyone who wants to write a pure parser.

My reentrant scanner:

    /* Lexer */
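%{
    /* Define YYSTYPE before including the bison header so both files agree on it. */
    #define YYSTYPE bool
    #include "parser.tab.h"
    #include <stdlib.h>
    #include <string.h>
    #include <string>
    void yyerror (yyscan_t yyscanner, char const *msg);
%}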

%option reentrant bison-bridge

%%

[0-1]      {
                /* With bison-bridge, yylval is a pointer to the parser's semantic value. */
                if (strcmp(yytext, "0")==0)
                {
                    *yylval = false;
                }
                else
                {
                    *yylval = true;
                }

                return INTEGER;
            }

[&|!()\n]     { return *yytext; }

[ \t]   ;       /* skip whitespace */

.               yyerror (yyscanner, "Unknown character");

%%

int yywrap(yyscan_t yyscanner)
{
    return 1;
}

bool parseExpression(const std::string& inp)
{
    yyscan_t myscanner;
    yylex_init(&myscanner);
    struct yyguts_t * yyg = (struct yyguts_t*)myscanner;

    yy_delete_buffer(YY_CURRENT_BUFFER,myscanner);

    /*Copy string into new buffer and Switch buffers*/
    yy_scan_string(inp.c_str(), myscanner);

    bool nasi = yyparse(myscanner);
    yylex_destroy(myscanner);
    return nasi;
}

My pure parser:

%{
    #include <stdio.h>
    #include <string>

    #define YYSTYPE bool
    typedef void* yyscan_t;
    void yyerror (yyscan_t yyscanner, char const *msg);
    int yylex(YYSTYPE *yylval_param, yyscan_t yyscanner);
    bool parseExpression(const std::string& inp);
%}


%define api.pure full
%lex-param {yyscan_t scanner}
%parse-param {yyscan_t scanner}

%token INTEGER
%left '&' '|'

%%

program:
        program statement '\n'
        | /* NULL */
        ;

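statement:
        /* Returning from the action ends yyparse() early and hands the value
           to the caller in place of the usual 0/1/2 status code. */
        expression                      { printf("%d\n", $1); return $1; }
        ;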

expression:
        INTEGER
        | expression '|' expression     { $$ = $1 | $3; }
        | expression '&' expression     { $$ = $1 & $3; }
        | '(' expression ')'            { $$ = $2; }
        | '!' expression                { $$ = !$2; }
        ;

%%

void yyerror (yyscan_t yyscanner, char const *msg){
    fprintf(stderr, "%s\n", msg);
}


int main(void) {

    std::string inp = "1|0\n";

    bool nasi = parseExpression(inp);
    printf("%s%d\n", "nasi ", nasi);
    printf("Press ENTER to close. ");
    getchar();
}

Notice that I cheated and defined yyg myself as

struct yyguts_t * yyg = (struct yyguts_t*)myscanner;

I couldn't find another way to get YY_CURRENT_BUFFER. If someone knows a better way to get YY_CURRENT_BUFFER, please tell me.
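
One possible way to avoid YY_CURRENT_BUFFER (and the yyg cast) altogether, sketched under the assumption that the function stays in the user-code section of the same '.l' file: yy_scan_string returns the YY_BUFFER_STATE it creates, so that handle can be passed to yy_delete_buffer once parsing is done. A scanner fresh from yylex_init has no buffer yet, so the initial yy_delete_buffer call should not be needed. The variable names scanner and buf are only illustrative.

```cpp
bool parseExpression(const std::string& inp)
{
    yyscan_t scanner;
    if (yylex_init(&scanner) != 0)
        return false;                 /* scanner state could not be allocated */

    /* yy_scan_string copies the text, makes the copy the current buffer,
       and returns its handle, so YY_CURRENT_BUFFER is never needed. */
    YY_BUFFER_STATE buf = yy_scan_string(inp.c_str(), scanner);

    bool nasi = yyparse(scanner);

    yy_delete_buffer(buf, scanner);   /* free the buffer created above */
    yylex_destroy(scanner);           /* free the scanner itself */
    return nasi;
}
```

This fragment only compiles as part of the flex-generated scanner, because yyscan_t, YY_BUFFER_STATE and the yy* functions come from the generated code.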

Here is a complete Flex/Bison C++ example. Everything is reentrant, no use of global variables. Both parser/lexer are encapsulated in a class placed in a separate namespace. You can instantiate as many "interpreters" in as many threads as you want.

https://github.com/ezaquarii/bison-flex-cpp-example

Disclaimer: it's not tested on Windows, but the code should be portable with minor tweaks.
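
The "one interpreter per thread" idea does not depend on flex or bison specifics. As a minimal sketch (not taken from the linked repository), the hand-written evaluator below keeps all scanning and parsing state inside one object, which is the property that a reentrant scanner and a pure parser give the generated code. The names Interpreter, evaluate and evaluateAll are hypothetical, and the choice that '!' applies only to the operand right after it is an assumption of this sketch rather than a statement about the grammar above.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for "reentrant scanner + pure parser":
// every piece of state lives in this object, nothing is global.
struct Interpreter {
    std::string input;
    std::size_t pos = 0;
    bool ok = true;   // cleared when the input is malformed

    explicit Interpreter(std::string text) : input(std::move(text)) {}

    // Skip blanks and return the next character without consuming it
    // ('\0' at end of input).
    char peek() {
        while (pos < input.size() && (input[pos] == ' ' || input[pos] == '\t'))
            ++pos;
        return pos < input.size() ? input[pos] : '\0';
    }

    // primary: '0' | '1' | '!' primary | '(' expression ')'
    bool primary() {
        char c = peek();
        if (c == '0' || c == '1') { ++pos; return c == '1'; }
        if (c == '!') { ++pos; return !primary(); }
        if (c == '(') {
            ++pos;
            bool v = expression();
            if (peek() == ')') ++pos; else ok = false;
            return v;
        }
        ok = false;
        return false;
    }

    // expression: primary (('&' | '|') primary)*, evaluated left to right
    // with equal precedence for '&' and '|'.
    bool expression() {
        bool v = primary();
        for (char c = peek(); c == '&' || c == '|'; c = peek()) {
            ++pos;
            bool rhs = primary();
            v = (c == '&') ? (v && rhs) : (v || rhs);
        }
        return v;
    }
};

// Evaluate one expression; malformed or trailing input yields false.
bool evaluate(const std::string& text) {
    Interpreter interp(text);
    bool value = interp.expression();
    return interp.ok && interp.peek() == '\0' && value;
}

// Evaluate each input on its own thread. Each thread owns its Interpreter
// and writes to its own slot, so no locking is required.
std::vector<int> evaluateAll(const std::vector<std::string>& inputs) {
    std::vector<int> results(inputs.size(), 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        workers.emplace_back([&inputs, &results, i] {
            results[i] = evaluate(inputs[i]) ? 1 : 0;
        });
    for (auto& t : workers)
        t.join();
    return results;
}
```

Calling evaluateAll({"1|0", "0|0"}) runs both evaluations concurrently. With flex and bison the same shape applies: each thread would create its own yyscan_t and pass it to yyparse.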


 