
Confusion about a Bison/YACC Grammar

With the following grammar, I get a syntax error on this sort of input:

ls /home > foo #Runs and works okay, but raises error token
ls /home /foo /bar /etc #works okay

I think it may have something to do with how lookahead works, but this is my first grammar and I am a bit confused about why it doesn't work. My reasoning: external_cmd GT WORD is a redirect, a redirect is a command, a command is a commands, so input commands NEWLINE should match.

Top rules of the grammar:

input:
    error NEWLINE {
        printf("Error Triggered\n");
        yyclearin;
        yyerrok; 
        prompt(); 
    } |
    input NEWLINE {
        prompt();
    } | 
    input commands NEWLINE {
        prompt (); 
    } | 
    /* empty */
    ;   

commands: 
    command |   
    command SEMI | 
    command SEMI commands
    ;   

command:
    builtin_cmd |
    redirect |
    external_cmd { 
        execute_command($1, 0, NULL);
    }
    ;

redirect:
    external_cmd GT WORD  {
        printf("Redirecting stdout of %s to %s\n", $1->cmd, $3);
        //printf("DEBUG: GT\n");
        execute_command($1, STDOUT_FILENO, $3);
    }
    external_cmd LT WORD {
        printf("Redirecting stin of %s to %s\n", $1->cmd, $3);
        //printf("DEBUG: GT\n");
        execute_command($1, STDIN_FILENO, $3);
    }
    ;

The debug/verbose output when the error token is raised:

Next token is token WORD ()
Shifting token WORD ()
Entering state 6
Reading a token: Next token is token WORD ()
Shifting token WORD ()
Entering state 24
Reading a token: Next token is token GT ()
Reducing stack by rule 22 (line 115):
   $1 = token WORD ()
-> $$ = nterm arg_list ()
Stack now 0 2 6
Entering state 26
Reducing stack by rule 19 (line 91):
   $1 = token WORD ()
   $2 = nterm arg_list ()
-> $$ = nterm external_cmd ()
Stack now 0 2
Entering state 16
Next token is token GT ()
Shifting token GT ()
Entering state 29
Reading a token: Next token is token WORD ()
Shifting token WORD ()
Entering state 33
Reducing stack by rule 11 (line 68):
Redirecting stdout of ls to foo
DEBUG: redirect mode is 1
DEBUG: Command to run is ls
DEBUG: Adding Argument /home
admin  kbrandt  tempuser
-> $$ = nterm @1 ()
Stack now 0 2 16 29 33
Entering state 34
Reading a token: Next token is token NEWLINE ()
syntax error
Error: popping nterm @1 ()
Stack now 0 2 16 29 33
Error: popping token WORD ()
Stack now 0 2 16 29
Error: popping token GT ()
Stack now 0 2 16
Error: popping nterm external_cmd ()
Stack now 0 2
Error: popping nterm input ()
Stack now 0
Shifting token error ()
Entering state 1
Next token is token NEWLINE ()
Shifting token NEWLINE ()
Entering state 3
Reducing stack by rule 1 (line 38):
   $1 = token error ()
   $2 = token NEWLINE ()
Error Triggered
-> $$ = nterm input ()
Stack now 0
Entering state 2

Update:
external_cmd is:

external_cmd:
    WORD arg_list {
        $$ = malloc( sizeof(struct ext_cmd) );
        if ( $$ == NULL)
            printf("Memory Allocation Error\n");
        $$->cmd = $1;
        $$->args_pp = $2;
    } |
    WORD    {
        $$ = malloc( sizeof(struct ext_cmd) );
        if ( $$ == NULL)
            printf("Memory Allocation Error\n");
        $$->cmd = $<string>1;
        $$->args_pp = NULL;
    }

The syntax error is coming from your SECOND call to yyparse. When you have the redirect, your grammar does a YYACCEPT, which causes the parser to return immediately without reading anything more. On the second call, the first token read is a NEWLINE, which causes the error (your grammar does not allow for blank lines).

With no redirect, there's no YYACCEPT, so the parser continues to run, reading the newline and returning on reaching the end of the input.

  1. You really should use left recursion with LALR(1) parser generators. Right recursion requires that all elements be shifted onto the parser state stack before even a single reduction of the list rule can occur. You can imagine what this does to error recovery.

  2. What exactly is external_cmd? It looks like it is being reduced early, but it's hard to tell because you didn't include it.

  3. Why is YYACCEPT invoked after any redirection? If you intend to restart the parser on each line, then you shouldn't have the recursive input collector. As long as you do have it, don't do a YYACCEPT.
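
To illustrate point 1, here is a minimal sketch of how the commands rule from the question might be rewritten with left recursion. The helper nonterminal command_list is a hypothetical name that does not appear in the original grammar; it is introduced only so that the rewritten rule accepts the same inputs as the original (commands separated by SEMI, with an optional trailing SEMI).

```yacc
/* Sketch only: command_list is a hypothetical helper nonterminal. */
command_list:
    command |
    command_list SEMI command   /* left recursion: reduce after each command */
    ;

commands:
    command_list |
    command_list SEMI           /* optional trailing SEMI, as in the original */
    ;
```

With this shape the list is reduced after every command, so the parser stack stays shallow no matter how many commands appear on one line.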

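Found it: there was a missing pipe in my redirect rule, so instead of two alternatives there was a single one with a mid-rule action, which is not what I wanted.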
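
For reference, a sketch of the redirect rule with the missing `|` restored is shown below. Without the separator, Bison treats the first action as a mid-rule action, so the rule effectively reads external_cmd GT WORD @1 external_cmd LT WORD. That matches the trace in the question: after reducing the anonymous nterm @1 (rule 11), the parser expected another external_cmd and hit NEWLINE instead. The message typo "stin" is also corrected here to "stdin"; everything else is taken from the question as posted.

```yacc
redirect:
    external_cmd GT WORD {
        printf("Redirecting stdout of %s to %s\n", $1->cmd, $3);
        execute_command($1, STDOUT_FILENO, $3);
    } |   /* this separator was missing in the original rule */
    external_cmd LT WORD {
        printf("Redirecting stdin of %s to %s\n", $1->cmd, $3);
        execute_command($1, STDIN_FILENO, $3);
    }
    ;
```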
