Lexical & Grammar Analysis using Yacc and Lex
I am fairly new to Yacc and Lex programming, but I am training myself by writing an analyser for C programs.
However, I am facing a small issue that I haven't managed to solve.
For a declaration such as int a,b; I want to save a and b in a simple array. I did manage to do that, but it saves a bit more than I wanted: it actually stores "a," or "b;" instead of "a" and "b".
It should have worked, since $1 should only return tID, which is a regular expression recognising only a character string. I don't understand why it picks up the comma even though the comma is defined as its own token. Does anyone know how to solve this problem?
Here are the corresponding yacc declarations:
Declaration :
      tINT Decl1 DeclN
        { printf("Declaration %s\n", $2); }
    | tCONST Decl1 DeclN
        { printf("Declaration %s\n", $2); }
    ;
Decl1 :
      tID
        { $$ = $1;
          tabvar[compteur].id = $1; tabvar[compteur].adresse = compteur;
          printf("Added %s at adress %d\n", $1, compteur);
          compteur++; }
    | tID tEQ E
        { $$ = $1;
          tabvar[compteur].id = $1; tabvar[compteur].adresse = compteur;
          printf("Added %s at adress %d\n", $1, compteur);
          pile[compteur] = $3;
          compteur++; }
    ;
DeclN :
      /*epsilon*/
    | tVIR Decl1 DeclN
And the extract of the Lex file:
separateur [ \t\n\r]
id [A-Za-z][A-Za-z0-9_]*
nb [0-9]+
nbdec [0-9]+\.[0-9]+
nbexp [0-9]+e[0-9]+
"," { return tVIR; }
";" { return tPV; }
"=" { return tEQ; }
{separateur} ;
{id} { yylval.str = yytext; return tID; }
{nb}|{nbdec}|{nbexp} { yylval.nb = atoi(yytext); return tNB; }
%%
int yywrap() {return 1;}
The problem is that yytext is a reference into lex's token scanning buffer, so it is only valid until the next time the parser calls yylex. By the time your action runs, the parser may already have read the next token as lookahead, which is why the stored pointer appears to contain the following character as well. You need to make a copy of the string in yytext if you want to return it. Something like:
{id} { yylval.str = strdup(yytext); return tID; }
will do the trick, though it also exposes you to the possibility of memory leaks, since every copy made with strdup has to be freed eventually.
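To make the aliasing concrete, here is a minimal sketch in plain C with no lex involved. Everything in it is hypothetical and only for illustration: `scan_buf` stands in for lex's scan buffer, and `next_token` is a toy scanner that, like flex-style scanners typically do, NUL-terminates the current token in place and restores the overwritten character on the next call. It assumes a POSIX system for `strdup`.

```c
#define _POSIX_C_SOURCE 200809L   /* for strdup */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for lex's scan buffer, holding the input "a,b;". */
char scan_buf[] = "a,b;";
size_t next_pos = 0;   /* index where the next token starts */
char held_char;        /* character overwritten by the temporary NUL */
int have_held = 0;     /* whether held_char must be restored */

/* Toy scanner: each call returns the next one-character token.
   The token is NUL-terminated in place, and the character that the
   terminator overwrote is put back on the following call.
   Returns NULL at end of input. */
char *next_token(void)
{
    if (have_held)
        scan_buf[next_pos] = held_char;   /* undo the previous terminator */
    if (scan_buf[next_pos] == '\0')
        return NULL;                      /* end of input */

    char *tok_text = &scan_buf[next_pos]; /* plays the role of yytext */
    next_pos += 1;
    held_char = scan_buf[next_pos];
    have_held = 1;
    scan_buf[next_pos] = '\0';
    return tok_text;
}

/* Compares a pointer saved straight from the scanner with a strdup'd copy
   after the "parser" has fetched one token of lookahead. */
void show_aliasing(void)
{
    char *aliased = next_token();     /* points into scan_buf, like yytext */
    assert(aliased != NULL);
    char *copied = strdup(aliased);   /* private copy, owned by the caller */
    assert(copied != NULL);

    next_token();                     /* lookahead: the scanner reads "," */

    printf("aliased: %s\n", aliased);
    printf("copied:  %s\n", copied);
    free(copied);                     /* the copy must be freed to avoid a leak */
}
```

Calling `show_aliasing()` on a fresh buffer prints `a,` for the aliased pointer and `a` for the copy, which matches the symptom in the question. The `free` call shows the other half of the trade-off: whoever keeps the copy is responsible for releasing it.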
Also, in general, when writing lex/yacc parsers involving single-character tokens, it is much clearer to use them directly as character constants (e.g. ',', ';', and '=') rather than defining named tokens (tVIR, tPV, and tEQ in your code).
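As a rough sketch of what that could look like for the grammar in the question (actions omitted for brevity, and assuming the rest of the lexer and parser stay as they are), the lexer returns the character itself as the token code and the grammar refers to it as a character literal:

```yacc
/* Lex file: one rule replaces the three separate ",", ";" and "=" rules. */
[,;=]       { return yytext[0]; }

/* Yacc file: character literals replace tVIR and tEQ. */
Decl1 :
      tID
    | tID '=' E
    ;
DeclN :
      /* epsilon */
    | ',' Decl1 DeclN
    ;
```

With this change the %token declarations for tVIR, tPV, and tEQ are no longer needed, because yacc treats a quoted character as a token whose code is the character's value.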