Segmentation fault on simple Bison script
OK, I'm doing a few experiments with Lex/Bison (Yacc), and given that my C skills are rather rusty (I once created compilers and such with all these tools, and now I'm lost in the first few lines... :-S), I need your help.
This is what my parser looks like:
%{
#include <stdio.h>
#include <string.h>

/* Provided by the Lex-generated scanner. */
int yylex(void);

void yyerror(const char *str)
{
    fprintf(stderr,"error: %s\n",str);
}

int yywrap(void)
{
    return 1;
}

int main(void)
{
    return yyparse();
}
%}

%union
{
    char* str;
}

%token <str> WHAT IS FIND GET SHOW WITH POSS OF NUMBER WORD
%type <str> statement
%start statements

%%

statement
    : GET { printf("get\n"); }
    | SHOW { printf("%s\n",$1); }
    | OF { printf("of\n"); }
    ;

statements
    : statement
    | statements statement
    ;
The Issue:
So, basically, whenever the parser comes across a "get", it prints "get", and so on. However, when I try to print "show" (using the $1 specifier), it fails with a segmentation fault.
What am I doing wrong?
Lex returns a number representing the token; you need to access yytext to get the text of what was parsed.
Something like:
statement
    : GET { printf("get\n"); }
    | SHOW { printf("%s\n",yytext); }
    | OF { printf("of\n"); }
    ;
To propagate the text of terminals, I associate a nonterminal with each terminal, pass back the char*, and start building the parse tree from there, for example. Note that I've left out the type declarations and the implementation of create_sww_ASTNode(char*, char*, char*). Importantly, not all nonterminals return the same type: number is an integer, word returns a char*, and sww returns an astNode (or whatever generic abstract syntax tree structure you come up with). Usually, beyond the nonterminals that represent terminals, it's all AST stuff.
sww : show word word
    {
        $$ = create_sww_ASTNode($1,$2,$3);
    }
    ;

word : WORD
    {
        $$ = malloc(strlen(yytext) + 1);
        strcpy($$,yytext);
    }
    ;

show : SHOW
    {
        $$ = malloc(strlen(yytext) + 1);
        strcpy($$,yytext);
    }
    ;

number : NUMBER
    {
        $$ = atoi(yytext);
    }
    ;
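The answer leaves out create_sww_ASTNode, so here is a minimal sketch of what it might look like in plain C. The SwwNode struct, its field names, and the helpers copy_lexeme and free_sww_ASTNode are hypothetical and are not part of the original answer. The copy step matters because yytext points into the scanner's buffer, which is overwritten when the next token is read.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for the `sww` rule: the SHOW keyword plus two words. */
typedef struct SwwNode {
    char *keyword;
    char *first;
    char *second;
} SwwNode;

/* Copy a lexeme out of the scanner's buffer so it survives the next token.
   Same effect as the malloc + strcpy pair in the grammar actions. */
char *copy_lexeme(const char *text)
{
    char *copy = malloc(strlen(text) + 1);
    if (copy != NULL)
        strcpy(copy, text);
    return copy;
}

/* Assumed signature, taken from the call in the grammar. The node takes
   ownership of the three strings. Returns NULL if allocation fails. */
SwwNode *create_sww_ASTNode(char *keyword, char *first, char *second)
{
    SwwNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->keyword = keyword;
    node->first = first;
    node->second = second;
    return node;
}

/* Release a node together with the strings it owns. */
void free_sww_ASTNode(SwwNode *node)
{
    if (node == NULL)
        return;
    free(node->keyword);
    free(node->first);
    free(node->second);
    free(node);
}
```

With something like this in place, the sww rule would also need a matching type declaration in the grammar (a pointer member in %union, for instance), which the answer likewise omits.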
You don't show your lexer code, but the problem is probably that you never set yylval to anything, so when you access $1 in the parser, it contains garbage and you get a crash. Your lexer actions need to set yylval.str to something so it will be valid:
"show"  { yylval.str = "SHOW"; return SHOW; }
[a-z]+  { yylval.str = strdup(yytext); return WORD; }
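To make the handoff concrete, here is a small stand-alone C sketch that imitates what a lexer action does. It is not Flex or Bison output: the YYSTYPE union mirrors the `%union { char* str; }` declaration from the question, while the token codes and the scan_lexeme function are hypothetical stand-ins. The point it illustrates is that the scanner has to store a valid string in yylval.str before it returns the token code, because that stored value is what the parser later sees as $1.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the union Bison generates from `%union { char* str; }`. */
typedef union {
    char *str;
} YYSTYPE;

/* In a real build the generated parser defines this global. */
YYSTYPE yylval;

/* Hypothetical token codes; Bison assigns the real ones in parser.tab.h. */
enum { SHOW = 258, WORD = 259 };

/* Hypothetical stand-in for a lexer action: copy the lexeme into
   yylval.str, then return the token code. Returns 0 (end of input)
   if the copy cannot be allocated. */
int scan_lexeme(const char *lexeme)
{
    size_t len = strlen(lexeme);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, lexeme, len + 1);   /* includes the terminating '\0' */
    yylval.str = copy;               /* this is what $1 will refer to */
    return strcmp(lexeme, "show") == 0 ? SHOW : WORD;
}
```

If the assignment to yylval.str were dropped, the token code would still come back correctly, but printing $1 with %s would read whatever pointer happened to be in the union, which is the crash described in the question.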
OK, so here's the answer (can somebody tell me why I always come up with the solution right after I've posted a question here on SO? lol!)
The problem was not with the parser itself, but with the lexer.
The thing is: when we write { printf("%s\n",$1); }, we are actually telling it to print the value the lexer left in yylval (by default an int, not a string; with the %union above it is the str member, which my lexer never set).
So, the trick is to store the text of the appropriate tokens as strings.
Here's my (updated) lexer file:
%{
#include <stdio.h>
#include <string.h>   /* strdup */
#include "parser.tab.h"

void toStr(void);
%}

DIGIT [0-9]
LETTER [a-zA-Z]
LETTER_OR_SPACE [a-zA-Z ]

%%

find                    { toStr(); return FIND; }
get                     { toStr(); return GET; }
show                    { toStr(); return SHOW; }
{DIGIT}+(\.{DIGIT}+)?   { toStr(); return NUMBER; }
{LETTER}+               { toStr(); return WORD; }
\n                      /* ignore end of line */;
[ \t]+                  /* ignore whitespace */;

%%

/* Copy the current lexeme into the semantic value seen by the parser as $n. */
void toStr(void)
{
    yylval.str=strdup(yytext);
}
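Since toStr() stores every token as a string, NUMBER included, a parser action that needs the numeric value has to convert it itself. Below is a minimal sketch of such a conversion. The helper name lexeme_to_number is hypothetical, and it uses strtod rather than atoi because the NUMBER pattern above also accepts decimals such as 2.5.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert a NUMBER lexeme such as "42" or "2.5".
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int lexeme_to_number(const char *lexeme, double *out)
{
    char *end;
    double value;

    errno = 0;
    value = strtod(lexeme, &end);

    /* Reject empty input, trailing junk, and out-of-range values. */
    if (end == lexeme || *end != '\0' || errno == ERANGE)
        return 0;

    *out = value;
    return 1;
}
```

A grammar action could call this on $1 and then free($1), since the string was allocated by strdup and nothing else releases it.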