
Make Bison accept an alternative EOF token

I'm writing an ANSI C parser in C++ with flex and bison; it's pretty complex.

The issue I'm having is a compilation error, shown below. It occurs because yyterminate() returns YY_NULL, which is defined as (an int) 0, whereas yylex has the return type yy::AnsiCParser::symbol_type. yyterminate(); is the automatic action for the <<EOF>> token in scanners generated by flex. Obviously this causes a type mismatch.

My scanner doesn't produce any special token for EOF, because EOF has no purpose in a C grammar. I could create a token rule for <<EOF>>, but if I ignore it then the scanner hangs in an infinite loop in yylex on the YY_STATE_EOF(INITIAL) case.

The compilation error:

ansi-c.yy.cc: In function ‘yy::AnsiCParser::symbol_type yylex(AnsiCDriver&)’:
ansi-c.yy.cc:145:17: error: could not convert ‘0’ from ‘int’ to ‘yy::AnsiCParser::symbol_type {aka yy::AnsiCParser::basic_symbol<yy::AnsiCParser::by_type>}’
ansi-c.yy.cc:938:30: note: in expansion of macro ‘YY_NULL’
ansi-c.yy.cc:1583:2: note: in expansion of macro ‘yyterminate’

Also, Bison generates this rule for my start rule (translation_unit) and the EOF ($end):

$accept: translation_unit $end

So yylex has to return something for the EOF or the parser will never stop waiting for input, but my grammar cannot support an EOF token. Is there a way to make Bison recognize something other than 0 for the $end condition without modifying my grammar?

Alternatively, is there simply something I can return from the <<EOF>> rule in the scanner to satisfy the Bison $end condition?

Normally, you would not include an explicit EOF rule in a lexical analyzer, not because it serves no purpose, but rather because the default is precisely what you want to do. (The purpose it serves is to indicate that the input is complete; otherwise, the parser would accept the valid prefix of certain invalid programs.)

Unfortunately, the C++ interfaces can defeat the simple convenience of the default EOF action, which is to return 0 (or NULL). I assume from your problem description that you have asked bison to generate a parser using complete symbols. In that case, you cannot simply return a 0 from yylex, since the parser is expecting a complete symbol, which is a more complex type than int. (Although the token which reports EOF does not normally have a semantic value, it does have a location, if you are using locations.) For other token types, bison will have automatically generated a function which makes a token, named something like make_FOO_TOKEN, which you will call in your scanner action for a FOO_TOKEN.

While the C bison parser does automatically define the end-of-file token (called END), it appears that the C++ interface does not. So you need to define it manually in a %token declaration in your bison input file:

%token END 0 "end of file"

(That defines the token type END with an integer value of 0 and the human-readable label "end of file". The value 0 is obligatory.)

Once you've done that, you can add an explicit EOF rule in your flex input file:

<<EOF>> return make_END();

If you are using locations, you'll have to give make_END a location argument as well.
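To make the role of the END token concrete, here is a minimal, self-contained C++ sketch of a scanner that returns a located end symbol and a loop that stops on it. Everything in it is a hypothetical stand-in: location, symbol_type, make_END, make_IDENTIFIER, and toy_scanner only imitate the shape of what bison and flex generate, and the real generated types have more members.

```cpp
#include <cassert>

// Hypothetical stand-ins for the generated location and symbol types.
struct location { int line = 1; int column = 1; };
struct symbol_type { int kind; location loc; };

// Token kinds: the end-of-input token must have the value 0.
enum token { END = 0, IDENTIFIER = 258 };

// Imitations of the generated factory functions, with a location argument.
symbol_type make_END(location l) { return symbol_type{END, l}; }
symbol_type make_IDENTIFIER(location l) { return symbol_type{IDENTIFIER, l}; }

// Toy scanner: produces `remaining` identifiers, then END forever.
struct toy_scanner {
    explicit toy_scanner(int n) : remaining(n) {}
    int remaining;
    location loc;

    symbol_type lex() {
        if (remaining == 0) return make_END(loc);  // plays the <<EOF>> rule
        --remaining;
        ++loc.column;
        return make_IDENTIFIER(loc);
    }
};

// Toy consumer: like the parser, it only stops when it sees kind 0.
int count_tokens_until_end(toy_scanner& s) {
    int n = 0;
    while (s.lex().kind != END) ++n;
    assert(s.remaining == 0);  // all input was consumed before END arrived
    return n;
}
```

Calling count_tokens_until_end on toy_scanner(3) consumes three identifiers and then stops on the END symbol. Without a symbol of kind 0 the loop would have no way to terminate, which mirrors the hang described in the question.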

Here's another way to prevent the compiler error "could not convert 0 from int to ...symbol_type": place this redefinition of the yyterminate macro just below where you redefine YY_DECL:

// change curLocation to the name of the location object used in yylex
// qualify symbol_type with the bison namespace used
#define yyterminate() return symbol_type(YY_NULL, curLocation)
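As a rough illustration of how that macro behaves inside a scanner function, here is a self-contained C++ sketch. The types location and symbol_type and the function mock_yylex are hypothetical stand-ins for the bison- and flex-generated code. The name curLocation is taken from the comment above and must match whatever location object your own yylex uses.

```cpp
#include <cassert>

// Hypothetical stand-ins for the bison-generated types.
struct location { int line = 1; int column = 1; };

struct symbol_type {
    // Mimics the constructor shape described for builds with locations.
    symbol_type(int tok, location l) : type(tok), loc(l) {}
    int type;
    location loc;
};

#define YY_NULL 0
// The redefinition from the answer: build a full symbol instead of a bare int.
#define yyterminate() return symbol_type(YY_NULL, curLocation)

// Mock scanner: one symbol per character, end symbol at the terminating NUL.
symbol_type mock_yylex(const char*& cursor, location& curLocation) {
    assert(cursor != nullptr);
    if (*cursor == '\0')
        yyterminate();          // expands to a return of a located symbol
    ++curLocation.column;
    return symbol_type(*cursor++, curLocation);
}
```

Because the macro names curLocation textually, it only compiles in a scope where a variable of that name is visible, which is why the comment in the snippet asks you to rename it to match your scanner.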

The compiler error shows up when bison locations are enabled, e.g. with %define locations. This makes bison add a location parameter to its symbol_type constructors, so the constructor without locations

symbol_type(int tok)

turns into this with locations

symbol_type(int tok, location_type l)

making it no longer possible to convert an int to a symbol_type, which is what the default definition of yyterminate in flex is able to do when bison locations are not enabled:

#define yyterminate() return YY_NULL
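The conversion argument can be checked in isolation with a small C++ sketch. The two structs below are hypothetical models of the two constructor shapes described here, not the real generated classes. They only show that a single-argument constructor makes int implicitly convertible, while a two-argument one does not.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the generated location type.
struct location_type { int line = 1; int column = 1; };

// Models symbol_type(int tok): a converting constructor.
struct symbol_without_locations {
    symbol_without_locations(int tok) : type(tok) {}
    int type;
};

// Models symbol_type(int tok, location_type l): no conversion from int alone.
struct symbol_with_locations {
    symbol_with_locations(int tok, location_type l) : type(tok), location(l) {}
    int type;
    location_type location;
};

#define YY_NULL 0

// True: `return YY_NULL;` is accepted for this return type.
constexpr bool plain_return_compiles =
    std::is_convertible<int, symbol_without_locations>::value;

// False: `return YY_NULL;` is rejected for this return type.
constexpr bool located_return_compiles =
    std::is_convertible<int, symbol_with_locations>::value;

// Compiles because of the implicit int conversion.
symbol_without_locations eof_without_locations() { return YY_NULL; }

// With locations, the symbol has to be constructed explicitly.
symbol_with_locations eof_with_locations(location_type cur) {
    assert(cur.line >= 1);
    return symbol_with_locations(YY_NULL, cur);
}
```

Under these assumptions, the first trait evaluates to true and the second to false, which matches the shape of the "could not convert" diagnostic.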

With this workaround there's no need to handle EOF in flex if you don't need to, and there's no need for a superfluous END token in bison if you don't need it.
