Segmentation fault on a simple Bison script

Question

OK, I'm doing a few experiments with Lex/Bison (Yacc), and given that my C skills are rather rusty (I once created compilers and such with all these tools, and now I'm lost in the first few lines... :-S), I need your help.

This is what my parser looks like:

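%{
#include <stdio.h>
#include <string.h>

int yylex(void);    /* provided by the Lex-generated scanner */
int yyparse(void);  /* generated by Bison further down this file */

void yyerror(const char *str)
{
    fprintf(stderr,"error: %s\n",str);
}

int yywrap(void)
{
    return 1;
}

int main(void)
{
    return yyparse();
}

%}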

%union 
{
    char* str;
}

%token <str> WHAT IS FIND GET SHOW WITH POSS OF NUMBER WORD

%type <str> statement
%start statements
%%

statement
    : GET { printf("get\n"); }
    | SHOW  { printf("%s\n",$1); }
    | OF { printf("of\n"); }
    ;

statements
    : statement
    | statements statement
    ;

The issue:

So, basically, whenever the parser comes across a "get", it prints "get", and so on.

However, when it tries to print "show" (using the $1 specifier), it gives a segmentation fault.

What am I doing wrong?

Answer 1

Lex returns a number representing the token; to get the text that was matched, you need to access yytext.

Something like:

statement               : GET { printf("get\n"); }
                        | SHOW  { printf("%s\n",yytext); }
                        | OF { printf("of\n"); }
                        ;

To propagate the text of terminals, I associate a nonterminal with each terminal, pass back the char*, and start building the parse tree from there, as in the example that follows. Note that I've left out the type declarations and the implementation of create_sww_ASTNode(char*, char*, char*). Importantly, not all nonterminals return the same type: number is an integer, word returns a char*, and sww returns an astNode (or whatever generic abstract syntax tree structure you come up with). Beyond the nonterminals that represent terminals, it's usually all AST stuff.

sww                     : show word word
                        {
                           $$ = create_sww_ASTNode($1,$2,$3);
                        }
                        ;

word                    : WORD
                        { 
                          $$ = malloc(strlen(yytext) + 1);
                          strcpy($$,yytext);
                        }
                        ;

show                    : SHOW
                        { 
                          $$ = malloc(strlen(yytext) + 1);
                          strcpy($$,yytext);
                        }
                        ;

number                  : NUMBER
                        { 
                           $$ = atoi(yytext);
                        }
                        ;
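
The word and show rules copy yytext rather than storing the pointer because the scanner reuses that buffer for the next token. A minimal sketch of the difference in plain C, assuming a hypothetical fixed buffer (token_buf) standing in for the scanner's yytext; the helper names scan_into_buffer and save_lexeme are illustrative and not part of Lex or Bison:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's reusable yytext buffer. */
static char token_buf[64];

/* Simulates the scanner matching a new token: the shared buffer is overwritten. */
const char *scan_into_buffer(const char *lexeme)
{
    strncpy(token_buf, lexeme, sizeof token_buf - 1);
    token_buf[sizeof token_buf - 1] = '\0';
    return token_buf;
}

/* Same pattern as the word and show rules: allocate, then copy. */
char *save_lexeme(const char *text)
{
    char *copy = malloc(strlen(text) + 1);
    if (copy != NULL)
        strcpy(copy, text);
    return copy;
}
```

A pointer returned by scan_into_buffer shows the newest token after the next call, while the string returned by save_lexeme keeps its contents until the caller frees it.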

Answer 2

You don't show your lexer code, but the problem is probably that you never set yylval to anything, so when you access $1 in the parser, it contains garbage and you get a crash. Your lexer actions need to set yylval.str to something valid:

"show"   { yylval.str = "SHOW"; return SHOW; }
[a-z]+   { yylval.str = strdup(yytext); return WORD; }
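
The contract behind those two rules can be sketched in plain C without running Flex or Bison. This is only an illustration: YYSTYPE, the token codes, and yylval here are hand-written stand-ins for what Bison would generate, and lex_one and copy_text are hypothetical helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for what Bison would generate from %union and %token. */
typedef union { char *str; } YYSTYPE;
enum { SHOW = 258, WORD = 259 };

YYSTYPE yylval;   /* shared slot: the lexer writes it, the parser reads it as $1 */

/* Hypothetical helper: heap copy of a lexeme, equivalent to strdup. */
static char *copy_text(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    if (p != NULL)
        strcpy(p, s);
    return p;
}

/* Hypothetical one-token "lexer": fills yylval, then returns the token code. */
int lex_one(const char *lexeme)
{
    yylval.str = copy_text(lexeme);   /* omit this and $1 is an unset pointer */
    return strcmp(lexeme, "show") == 0 ? SHOW : WORD;
}
```

After lex_one("show") returns SHOW, yylval.str points at a heap copy of "show", which is the value a parser action would see as $1 and should eventually free.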

Answer 3

OK, so here's the answer. (Can somebody tell me why I always come up with the solution right after I've posted a question here on SO? lol!)

The problem was not with the parser itself, but with the lexer.

The thing is: when you write { printf("%s\n",$1); }, you are actually telling it to print the token's semantic value, which comes from yylval (by default an int, not a string) and which my lexer never set.

So, the trick is to have the lexer store the appropriate tokens as strings.

Here's my (updated) lexer file:

%{
#include <stdio.h>
#include <string.h>   /* strdup */
#include "parser.tab.h"

void toStr(void);
%}

DIGIT               [0-9]
LETTER              [a-zA-Z]
LETTER_OR_SPACE     [a-zA-Z ]

%%

find    { toStr(); return FIND; }
get     { toStr(); return GET; }
show    { toStr(); return SHOW; }

{DIGIT}+(\.{DIGIT}+)?   { toStr(); return NUMBER; }
{LETTER}+               { toStr(); return WORD; }
\n                      /* ignore end of line */;
[ \t]+                  /* ignore whitespace */;
%%

/* Store a heap copy of the matched text as the token's semantic value. */
void toStr(void)
{
    yylval.str=strdup(yytext);
}
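
Since toStr() stores every token as text, NUMBER included, a parser action that needs the numeric value has to convert it itself. A minimal sketch in plain C, assuming a hypothetical helper named parse_number (not part of Flex or Bison):

```c
#include <stdlib.h>

/* Hypothetical helper: converts a NUMBER lexeme such as "42" or "2.5".
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_number(const char *text, double *out)
{
    char *end = NULL;
    double value = strtod(text, &end);

    if (end == text || *end != '\0')   /* nothing parsed, or trailing junk */
        return 0;
    *out = value;
    return 1;
}
```

strtod is used here rather than atoi because the NUMBER pattern also accepts a fractional part.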

Answer 1
1 2014-01-22 04:09:18

Answer 2
1 2014-01-22 04:32:21

Answer 3
0 accepted 2014-01-22 04:13:00
