
lex & yacc: get current position

In lex & yacc there is a macro called YY_INPUT which can be redefined, for example like this:

#define YY_INPUT(buf,result,maxlen) do { \
    const int n = gzread(gz_yyin, buf, maxlen); \
    if (n < 0) { \
        int errNumber = 0; \
        reportError(gzerror(gz_yyin, &errNumber)); \
    } \
    result = n > 0 ? n : YY_NULL; \
} while (0)

I have a grammar rule that invokes the YYACCEPT macro. If I call gztell (or ftell) after YYACCEPT, I get the wrong number, because the parser has already read more data than it needed.

So how can I get the current position when a rule calls YYACCEPT? (One bad solution would be to read character by character.)

(I have already tried something like this:

#define YY_USER_ACTION do { \
        current_position += yyleng; \
} while (0)   

but it does not seem to work.)

You have to keep track of the offset yourself. A simple but annoying solution is to put:

offset += yyleng;

in every flex action. Fortunately, you can do this implicitly by defining the YY_USER_ACTION macro, which is executed just before each token's action.

That might still not be right for your grammar, because bison often reads one token ahead. So you'll also need to attach the value of offset to each lexical token, most conveniently using the location facility (yylloc).

Edit: added more details on location tracking.

The following has not been tested. You should read the sections on location tracking in both the flex and the bison manuals.

The yylloc global variable and its default type are included in the generated bison code if you use the --locations command-line option or the %locations directive, or if you simply refer to a location value in some rule using the @ syntax, which is analogous to the $ syntax (that is, @n is the location value of the right-hand-side object whose semantic value is $n). Unfortunately, the default type for yylloc uses ints, which may not be wide enough to hold a file offset, although you might not be planning on parsing files for which this matters. In any event, it's easy enough to change: you merely have to #define the YYLTYPE macro at the top of your bison file. The default YYLTYPE is:

typedef struct YYLTYPE
     {
       int first_line;
       int first_column;
       int last_line;
       int last_column;
     } YYLTYPE;

For a minimal modification, I'd suggest keeping the member names unchanged; otherwise you'll also need to fix the YYLLOC_DEFAULT macro in your bison file. The default YYLLOC_DEFAULT ensures that non-terminals get a location value whose first_line and first_column members come from the first element in the non-terminal's RHS, and whose last_line and last_column members come from the last element. Since it is a macro, it will work with any assignable type for the various members, so it is enough to change the column members to long, size_t or offset_t (or your platform's equivalent, such as off_t), as you feel appropriate:

/* No trailing semicolon: YYLTYPE is expanded inside declarations. */
#define YYLTYPE yyltype
typedef struct yyltype {
  int first_line;
  offset_t first_column;
  int last_line;
  offset_t last_column;
} yyltype;

Then in your flex input, you could define the YY_USER_ACTION macro:

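offset_t offset;               /* running offset of the end of the last match */
extern YYLTYPE yylloc;

/* Executed before every rule's action; yylineno needs %option yylineno. */
#define YY_USER_ACTION         \
  offset += yyleng;            \
  yylloc.last_line = yylineno; \
  yylloc.last_column = offset;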

With all that done and appropriate initialization, you should be able to use the appropriate @n.last_column in the rule that calls YYACCEPT to extract the offset of the end of the last token in the accepted input.

