End of grammar rule in YACC

I am an absolute beginner with yacc/lex, and I have stumbled upon something that seems simple but that I am unable to understand. I have the following two rules: S : E; and E : STR; (in the lexer, [a-z]+ is mapped to STR). My guess is that when I give the input "hithere", for example, the input is consumed and the parser should exit, no?

The thing is, the parser is still waiting for input, so somehow S : E is not consumed (or so I guess). If I continue giving input, a syntax error is raised (which is expected).

My question is: in which case does the parser stop asking for input? More precisely, why is the rule S : E; not satisfied for my specific example?

I attach my .l and .y files here:

test1.l:

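%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
%}

%option noyywrap

%%
[a-z]+                  { /* copy the text: yytext is overwritten by the next match */
                          yylval.str = strdup(yytext); return (STR);}
.                       { ; /* ignore any other character except newline */ }
%%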

test1.y:

%{
#include <stdio.h>
#include <stdlib.h>
extern int yylex();
int yyerror(char *msg);   /* declared here because the generated parser calls it */
%}

%union {
    char    *str;
}

%token <str> STR
%type <str> E

%%

S : E                   {printf("%s\n", $1);}
  ;

E : STR                 {$$ = $1;}
  ;

%%

int yyerror(char *msg) {
    printf("%s\n", msg);
    return (0);
}

int main() {
    yyparse();
    return (0);
}

The thing that seems really weird to me is that if I give the input "hithere", "hithere" is printed back on my terminal, which is a strong indicator to me that S : E; actually has been recognized and printf() executed.

It's waiting for more input: the parser cannot finish until it sees the end of the input. You need to type Ctrl-D or Ctrl-Z, depending on your system.

Bison/yacc (and many, though not all, derivatives) actually construct an "augmented" grammar by adding a new start production which is effectively:

$start: S END

Where S is your start symbol (or the first non-terminal in the grammar if you don't specify one), and END is a token representing the end of input. (It is a real token, whose value is 0. (f)lex scanners return 0 when they get an end-of-file, so to the parser it looks like it's being given an END token.)

So the parser won't return until it sees an END token, which means that the scanner has seen an end of file. If your input is coming from a terminal, you need to send an EOF, typically by typing the EOF character: Control-D on most Unix-like systems, or Control-Z on Windows/DOS.
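To make the role of the END token concrete, here is a minimal hand-written sketch in C of what the augmented grammar demands. It is not the code bison generates: `Scanner`, `next_token` and `parse` are hypothetical names, 258 is an assumed token code for STR, and the scanner reads from a string instead of standard input.

```c
#include <ctype.h>

/* 0 is the end-of-input token; 258 is an assumed code for STR. */
enum { END = 0, STR = 258 };

/* Hypothetical stand-in for the generated scanner: it reads from a string. */
typedef struct {
    const char *p;   /* current position in the input */
} Scanner;

/* Returns STR for a run of lowercase letters and END (0) at end of input.
   Every other character is skipped, roughly like the `.` rule in test1.l. */
static int next_token(Scanner *s) {
    while (*s->p != '\0' && !islower((unsigned char)*s->p))
        s->p++;
    if (*s->p == '\0')
        return END;
    while (islower((unsigned char)*s->p))
        s->p++;
    return STR;
}

/* Accepts exactly "STR END", mirroring $start: S END with S: E and E: STR.
   Returns 0 on success and 1 on a syntax error, like yyparse(). */
static int parse(const char *input) {
    Scanner s = { input };
    if (next_token(&s) != STR)
        return 1;                 /* E : STR */
    if (next_token(&s) != END)
        return 1;                 /* $start : S END */
    return 0;
}
```

In this sketch, parse("hithere") returns 0 only after the second call to next_token has returned END. In the real program that second call is a read from the terminal, which is why the parser appears to hang until EOF is sent.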

Unlike many parser generators, bison will perform a reduction without reading a lookahead symbol if the lookahead symbol is not necessary to decide that the reduction must be performed. In the case of your grammar, that is possible with the S: E production because there is no possible shift; either the reduction is correct (if the next token is END) or the input is not syntactically valid (if the next token is anything else). So the semantic value of the string is printed. For an even slightly more complicated grammar, that wouldn't happen (until the EOF is recognized).
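The timing difference can be sketched with two hand-written recognizers in C. This is only an illustration under stated assumptions, not bison's generated code: `Scanner`, `Result`, `parse_simple` and `parse_list` are hypothetical names, 258 is an assumed token code for STR, and the second grammar (E : STR | E STR) is an invented example of a "slightly more complicated" grammar. Each function records how many tokens had been requested from the scanner when the action for S : E would run.

```c
#include <stddef.h>

/* 0 is the end-of-input token; 258 is an assumed code for STR. */
enum { END = 0, STR = 258 };

typedef struct {
    const int *tokens;   /* token stream, terminated by END */
    size_t calls;        /* number of tokens requested so far */
} Scanner;

static int next_token(Scanner *s) {
    int tok = s->tokens[s->calls];
    s->calls++;
    return tok;
}

typedef struct {
    size_t calls_at_action;  /* scanner calls made when the S: E action ran */
    int status;              /* 0 = accepted, 1 = syntax error */
} Result;

/* Grammar: S: E;  E: STR.
   After STR is shifted, reducing is the only possible move, so the
   action can run before the next token is requested. */
static Result parse_simple(const int *tokens) {
    Scanner s = { tokens, 0 };
    Result r = { 0, 1 };
    if (next_token(&s) != STR)
        return r;
    r.calls_at_action = s.calls;             /* S: E action runs here */
    r.status = (next_token(&s) == END) ? 0 : 1;
    return r;
}

/* Hypothetical grammar: S: E;  E: STR | E STR.
   After an E the parser must see the next token to choose between
   shifting another STR and reducing S: E, so the action is delayed. */
static Result parse_list(const int *tokens) {
    Scanner s = { tokens, 0 };
    Result r = { 0, 1 };
    int tok;
    if (next_token(&s) != STR)
        return r;
    while ((tok = next_token(&s)) == STR) {
        /* E: E STR */
    }
    if (tok != END)
        return r;
    r.calls_at_action = s.calls;             /* S: E action runs after END */
    r.status = 0;
    return r;
}
```

For the token stream STR END, parse_simple runs its action after one scanner call, while parse_list needs two, the second being the END token. On a terminal that second call is the one that blocks until EOF is typed, so with the list grammar nothing would be printed before then.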
