简体   繁体   中英

Make Bison accept an alternative EOF token

I'm writing an ansi-C parser in C++ with flex and bison; it's pretty complex.

The issue I'm having is a compilation error. The error is below, it's because yy_terminate returns YY_NULL which is defined as (an int) 0 and yylex has the return type of yy::AnsiCParser::symbol_type . yy_terminate(); is the automatic action for the <<EOF>> token in scanners generated by flex. Obviously this causes a type issue.

My scanner doesn't produce any special token for the EOF, because EOF has no purpose in a C grammar. I could create a token-rule for the <<EOF>> but if I ignore it then the scanner hangs in an infinite loop in yylex on the YY_STATE_EOF(INITIAL) case.

The compilation error,

ansi-c.yy.cc: In function ‘yy::AnsiCParser::symbol_type yylex(AnsiCDriver&)’:
ansi-c.yy.cc:145:17: error: could not convert ‘0’ from ‘int’ to ‘yy::AnsiCParser::symbol_type {aka yy::AnsiCParser::basic_symbol<yy::AnsiCParser::by_type>}’
ansi-c.yy.cc:938:30: note: in expansion of macro ‘YY_NULL’
ansi-c.yy.cc:1583:2: note: in expansion of macro ‘yyterminate’

Also, Bison generates this rule for my start-rule (translation_unit) and the EOF ($end).

$accept: translation_unit $end

So yylex has to return something for the EOF or the parser will never stop waiting for input, but my grammar cannot support an EOF token. Is there a way to make Bison recognize something other then 0 for the $end condition without modifying my grammar?

Alternatively, is there simply something I can return from the <<EOF>> token in the scanner to satisfy the Bison $end condition?

Normally, you would not include an explicit EOF rule in a lexical analyzer, not because it serves no purpose, but rather because the default is precisely what you want to do. (The purpose it serves is to indicate that the input is complete; otherwise, the parser would accept the valid prefix of certain invalid programs.)

Unfortunately, the C++ interfaces can defeat the simple convenience of the default EOF action, which is to return 0 (or NULL). I assume from your problem description that you have asked bison to generate a parser using complete symbols . In that case, you cannot simply return a 0 from yylex since the parser is expecting a complete symbol, which is a more complex type than int (Although the token which reports EOF does not normally have a semantic value, it does have a location, if you are using locaitons.) For other token types, bison will have automatically generated a function which makes an token, named something like make_FOO_TOKEN , which you will call in your scanner action for a FOO_TOKEN .

While the C bison parser does automatically define the end of file token (called END ), it appears that the C++ interface does not. So you need to manually define it in your %token declaration in your bison input file:

%token END 0 "end of file"

(That defines the token type END with an integer value of 0 and the human readable label "end of file". The value 0 is obligatory.)

Once you've done that, you can add an explicit EOF rule in your flex input file:

<<EOF>> return make_END();

If you are using locations, you'll have to give make_END a location argument as well.

Here's another way to prevent the compiler error could not convert 0 from int to ...symbol_type - place this redefinition of the yyterminate macro just below where you redefine YY_DECL

// change curLocation to the name of the location object used in yylex
// qualify symbol_type with the bison namespace used
#define yyterminate() return symbol_type(YY_NULL, curLocation)

The compiler error shows up when bison locations are enabled, eg with %define locations - this makes bison add a location parameter to its symbol_type constructors so the constructor without locations

symbol_type(int tok)

turns into this with locations

symbol_type(int tok, location_type l)

rendering it no longer possible to convert an int to a symbol_type which is what the default definition of yyterminate in flex is able to do when bison locations are not enabled

#define yyterminate() return YY_NULL

With this workaround there's no need to handle EOF in flex if you don't need to - there's no need for a superfluous END token in bison if you don't need it

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM