Lexical and syntax analysis with Yacc and Lex

Question

I am fairly new to Yacc and Lex programming, but I am training myself by writing an analyser for C programs.

However, I am facing a small issue that I have not managed to solve.

When there is a declaration such as int a,b; I want to save a and b in a simple array. I did manage to do that, but it saves a bit more than I wanted: it actually saves "a," or "b;" instead of "a" and "b".

It should have worked, as $1 should only return tID, which is a regular expression that recognises only an identifier string. I don't understand why it takes the comma even though the comma is defined as a separate token. Does anyone know how to solve this problem?

Here are the corresponding yacc declarations:

Declaration :
    tINT Decl1 DeclN
        {printf("Declaration %s\n", $2);}
    | tCONST Decl1 DeclN
        {printf("Declaration %s\n", $2);}
;

Decl1 :
    tID
        {$$ = $1;
        /* record the identifier and its slot number */
        tabvar[compteur].id=$1; tabvar[compteur].adresse=compteur;
        printf("Added %s at address %d\n", $1, compteur);
        compteur++;}
    | tID tEQ E
        {$$ = $1;
        tabvar[compteur].id=$1; tabvar[compteur].adresse=compteur;
        printf("Added %s at address %d\n", $1, compteur);
        pile[compteur]=$3;
        compteur++;}
;

DeclN :
    /*epsilon*/
    | tVIR Decl1 DeclN
;

And here is the extract of the Lex file:

separateur [ \t\n\r]
id [A-Za-z][A-Za-z0-9_]*
nb [0-9]+
nbdec [0-9]+\.[0-9]+
nbexp [0-9]+e[0-9]+

","                     { return tVIR; }
";"                     { return tPV; }
"="                     { return tEQ; }

{separateur}            ;
{id}                   { yylval.str = yytext; return tID; }
{nb}|{nbdec}|{nbexp}   { yylval.nb = atoi(yytext); return tNB; }


%%
int yywrap() {return 1;}

Answer 1

The problem is that yytext is a pointer into lex's token scanning buffer, so it is only valid until the next time the parser calls yylex. Here the parser has to read the token that follows tID before it can run the Decl1 action, so by the time $1 is used the string it points to already extends over the following "," or ";". You need to make a copy of the string in yytext if you want to return it. Something like:

{id}                   { yylval.str = strdup(yytext); return tID; }

will do the trick, though it also exposes you to the possibility of memory leaks, since each copy has to be freed at some point.

Also, in general, when writing lex/yacc parsers involving single-character tokens, it is much clearer to use them directly as character constants (e.g. ',', ';', and '=') rather than defining named tokens (tVIR, tPV, and tEQ in your code).

Solution 1
1 accepted 2017-02-18 17:44:01