
Why is a simple grammar rule in bison not working?

I am learning flex and bison, and I cannot figure out why such a simple grammar rule does not work as I expect. Here is the lexer code:

%{

#include <stdio.h>
#include "zparser.tab.h"

%}

%%

[\t\n ]+            { /* ignore whitespace */ }

FROM|from           { return FROM;   }
select|SELECT       { return SELECT; }
update|UPDATE       { return UPDATE; }
insert|INSERT       { return INSERT; }
delete|DELETE       { return DELETE; }
[a-zA-Z].*          { return IDENTIFIER; }
\*                  { return STAR;   }

%%

And below is the parser code:

%{
#include<stdio.h>
#include<iostream>
#include<vector>
#include<string>
using namespace std;

extern int yyerror(const char* str);
extern int yylex();


%}

%%

%token SELECT UPDATE INSERT DELETE STAR IDENTIFIER FROM;


ZQL     : SELECT STAR FROM  IDENTIFIER { cout<<"Done"<<endl; return 0;}
        ;

%%

Can anyone tell me why it reports an error when I enter "select * from something"?

[a-zA-Z].* will match an alphabetic character followed by any number of arbitrary characters except newline. In other words, it will match from an alphabetic character to the end of the line.

Since flex always accepts the longest match, the line select * from ... will appear to have only one token, IDENTIFIER , and that is a syntax error.
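The longest-match behaviour described above can be illustrated with a small hand-written C sketch. This is not flex's actual implementation, and the function names are hypothetical. It only compares how many characters each of the two competing patterns would consume at the start of the input (isalpha is assumed to behave like [a-zA-Z], as it does in the default "C" locale).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length matched by the pattern [a-zA-Z].* at the start of s:
   one letter, then every character up to (not including) a newline. */
size_t match_greedy_identifier(const char *s) {
    if (!isalpha((unsigned char)s[0])) return 0;
    size_t n = 1;
    while (s[n] != '\0' && s[n] != '\n') n++;
    return n;
}

/* Length matched by a literal keyword pattern such as "select". */
size_t match_keyword(const char *s, const char *kw) {
    size_t n = strlen(kw);
    return strncmp(s, kw, n) == 0 ? n : 0;
}

/* Longest-match choice between the two rules: returns 1 if the
   identifier rule wins, 0 if the keyword rule wins or ties
   (on a tie, flex prefers the rule listed first, here the keyword). */
int identifier_rule_wins(const char *s, const char *kw) {
    return match_greedy_identifier(s) > match_keyword(s, kw);
}
```

For the input "select * from something", the keyword pattern consumes 6 characters while [a-zA-Z].* consumes all 23, so the identifier rule wins and the parser never sees SELECT.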

[a-zA-Z].* { return IDENTIFIER; }

The problem is here. It allows any characters to follow an initial letter and be returned as IDENTIFIER, including, in this case, the entire rest of the line after the initial 's'.

It should be:

[a-zA-Z]+          { return IDENTIFIER; }

or possibly

[a-zA-Z][a-zA-Z0-9]*          { return IDENTIFIER; }

or whatever else you want to allow to follow an initial alpha character in your identifiers.
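To see what the corrected rule changes, here is a rough C sketch of a hand-written tokenizer that mimics the lexer with [a-zA-Z][a-zA-Z0-9]* as the identifier pattern. It is an illustration under assumptions, not what flex generates: the token codes and function names are hypothetical (the real codes come from zparser.tab.h), and only the SELECT and FROM keywords are handled for brevity.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes for this sketch only. */
enum token { T_EOF = 0, T_SELECT, T_FROM, T_STAR, T_IDENTIFIER, T_ERROR };

/* True if the n-character lexeme at s is the keyword in either spelling
   the lexer accepts (all lower case or all upper case). */
static int is_word(const char *s, size_t n, const char *lower, const char *upper) {
    return n == strlen(lower) &&
           (strncmp(s, lower, n) == 0 || strncmp(s, upper, n) == 0);
}

/* Return the next token starting at *p and advance *p past it. */
enum token next_token(const char **p) {
    const char *s = *p;
    while (*s == ' ' || *s == '\t' || *s == '\n') s++;   /* [\t\n ]+ */
    if (*s == '\0') { *p = s; return T_EOF; }
    if (*s == '*')  { *p = s + 1; return T_STAR; }       /* \* */
    if (isalpha((unsigned char)*s)) {
        size_t n = 1;
        while (isalnum((unsigned char)s[n])) n++;        /* [a-zA-Z][a-zA-Z0-9]* */
        *p = s + n;
        /* Same length as the identifier match: the keyword takes priority,
           just as the earlier rule does in the flex file. */
        if (is_word(s, n, "select", "SELECT")) return T_SELECT;
        if (is_word(s, n, "from", "FROM"))     return T_FROM;
        return T_IDENTIFIER;
    }
    *p = s + 1;                                          /* unmatched character */
    return T_ERROR;
}
```

Calling next_token repeatedly on "select * from something" yields T_SELECT, T_STAR, T_FROM, T_IDENTIFIER and then T_EOF, which is the sequence the grammar rule expects. A word such as "selection" still comes out as T_IDENTIFIER, because the identifier pattern matches more characters than the keyword does.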
