Translating a grammar file in a hand-written parser

I've been trying to write my own compiler for educational purposes and I'm stuck on an issue. I've taken the recursive descent approach, with some previous knowledge of lex and yacc/bison.

So far I'm just trying to handle the parsing aspect, without regard to building the AST or generating code.

I'm trying to write the expression parsing for this part of the grammar file:

primary_expression
    : IDENTIFIER
    | CONSTANT
    | STRING_LITERAL
    | '(' expression ')'
    ;

postfix_expression
    : primary_expression
    | postfix_expression '[' expression ']'
    | postfix_expression '(' ')'
    | postfix_expression '(' argument_expression_list ')'
    | postfix_expression '.' IDENTIFIER
    | postfix_expression PTR_OP IDENTIFIER
    | postfix_expression INC_OP
    | postfix_expression DEC_OP
    ;

So far I have this code:

void Parser::primaryExpression()
{
    if (accept(Token::eIdentifier))
    {

    }
    else if (accept(Token::eIntNumber))
    {

    }
    else if (accept('('))
    {
        expression();
        expect(')');
    }
}
void Parser::postfixExpression()
{

}

I'm having some problems dealing with the recursion in postfix_expression, and I don't know how to continue with the postfixExpression function.

I'm under the impression that for a recursive descent parser, I should probably arrange my grammar in a different way.

Could anyone point me in the right direction?

Note that postfix_expression always parses primary_expression first, so the first order of business is primaryExpression().

Then, if the next token is one of the tokens that follow the recursive postfix_expression in the remaining seven rules, you are still parsing a postfix_expression. Applying that rule gives you another postfix_expression, so you repeat.

I won't write the C++ code for you, but in pseudocode:

postfixExpression()
{
    primaryExpression();
    while (next token is one of the tokens that follow
           postfix_expression in the remaining seven rules)
    {
         parse_the_appropriate_rule();
    }
}

Left-recursion is difficult to handle in an LL (recursive descent) parser: you need to recognize it and change it into a loop rather than a recursive call. In general terms, you want to refactor the left-recursive rule into the form

A → α | A β

and then your recursive descent routine becomes

parseA() {
    parseAlpha();
    while (lookaheadMatchesBeta())
        parseBeta();
}

Note that this requires enough lookahead to distinguish between FIRST(β) and FOLLOW(A), in order to find the end of all the trailing things that can match β.

This is the same as the process for eliminating left recursion in an LL grammar: you are effectively replacing the rule above with

A  → α A'
A' → ε | β A'

and then replacing the tail-recursive call in parseAPrime with a loop and inlining it into parseA.

Doing that with your grammar, and using the accept/expect technique from your code above, you get something like:
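To make that equivalence concrete, here is a minimal sketch on a toy grammar that is not the one from the question: E → NUM | E '-' NUM, with single-digit numbers and no whitespace. The names Cursor, parsePrime, parseRecursive and parseLoop are hypothetical, chosen only for this illustration. The first parser follows the A' form with a tail-recursive call; the second is the same logic after the tail call has been turned into a loop and inlined.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Toy grammar (an assumption for illustration, not the question's grammar):
//   E  -> NUM | E '-' NUM        left-recursive form
//   E  -> NUM E'                 after eliminating left recursion
//   E' -> epsilon | '-' NUM E'
// NUM is a single decimal digit; whitespace is not handled.

struct Cursor {
    const std::string& text;
    std::size_t pos;

    explicit Cursor(const std::string& t) : text(t), pos(0) {}

    // Consume c if it is the next character.
    bool accept(char c) {
        if (pos < text.size() && text[pos] == c) { ++pos; return true; }
        return false;
    }

    // Consume one digit and return its value; a digit must be present here.
    int number() {
        assert(pos < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[pos])));
        return text[pos++] - '0';
    }
};

// E' written as a tail-recursive function. 'left' carries the value of
// everything parsed so far, which keeps the result left-associative.
int parsePrime(Cursor& in, int left) {
    if (in.accept('-')) {
        int right = in.number();
        return parsePrime(in, left - right);  // tail call: E' again
    }
    return left;                              // epsilon alternative
}

int parseRecursive(const std::string& s) {
    Cursor in(s);
    int first = in.number();
    return parsePrime(in, first);
}

// The same parser with the tail call replaced by a loop and inlined.
int parseLoop(const std::string& s) {
    Cursor in(s);
    int value = in.number();
    while (in.accept('-'))
        value -= in.number();
    return value;
}
```

Both functions treat "9-3-2" as (9-3)-2, so the rewritten grammar still yields the left-associative result that the original left-recursive rule describes.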

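void Parser::postfixExpression() {
    primaryExpression();
    while (true) {
        if (accept('[')) {                 // postfix_expression '[' expression ']'
            expression();
            expect(']');
        } else if (accept('(')) {          // call, with or without arguments
            if (!accept(')')) {
                argumentExpressionList();
                expect(')');
            }
        } else if (accept('.') || accept(Token::PTR_OP)) {
            // member access; Token::PTR_OP is an assumed name for the "->" token
            expect(Token::eIdentifier);
        } else if (accept(Token::INC_OP) || accept(Token::DEC_OP)) {
            // postfix ++ / --; nothing more to parse (Token::INC_OP is an assumed name)
        } else {
            break;                         // no postfix operator follows
        }
    }
}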
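
If you want to try the loop in isolation, the following is a self-contained sketch of a recognizer built the same way. It is not the Parser class from the question: MiniParser and isPostfixExpression are hypothetical names, tokens are read directly from a string (single-letter identifiers, single-digit constants, no whitespace), string literals are left out, and expression is simplified to a bare postfix_expression.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the question's Parser, for illustration only.
class MiniParser {
public:
    explicit MiniParser(const std::string& text) : text_(text), pos_(0) {}

    // True if the entire input is exactly one postfix_expression.
    bool parse() {
        try {
            postfixExpression();
            return pos_ == text_.size();
        } catch (const std::runtime_error&) {
            return false;
        }
    }

private:
    std::string text_;
    std::size_t pos_;

    // Consume tok if the input continues with it.
    bool accept(const std::string& tok) {
        assert(!tok.empty());
        if (text_.compare(pos_, tok.size(), tok) == 0) {
            pos_ += tok.size();
            return true;
        }
        return false;
    }

    // Like accept, but a missing token is a syntax error.
    void expect(const std::string& tok) {
        if (!accept(tok)) throw std::runtime_error("expected " + tok);
    }

    bool acceptIdentifier() {
        if (pos_ < text_.size() &&
            std::isalpha(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
            return true;
        }
        return false;
    }

    bool acceptConstant() {
        if (pos_ < text_.size() &&
            std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
            return true;
        }
        return false;
    }

    // primary_expression : IDENTIFIER | CONSTANT | '(' expression ')'
    void primaryExpression() {
        if (acceptIdentifier() || acceptConstant()) return;
        expect("(");
        expression();
        expect(")");
    }

    // Simplification: an expression is just a postfix_expression here.
    void expression() { postfixExpression(); }

    // One or more expressions separated by commas.
    void argumentExpressionList() {
        do { expression(); } while (accept(","));
    }

    // The left-recursive rule, written as a loop over postfix operators.
    void postfixExpression() {
        primaryExpression();
        for (;;) {
            if (accept("[")) {
                expression();
                expect("]");
            } else if (accept("(")) {
                if (!accept(")")) {
                    argumentExpressionList();
                    expect(")");
                }
            } else if (accept(".") || accept("->")) {
                if (!acceptIdentifier())
                    throw std::runtime_error("expected identifier");
            } else if (accept("++") || accept("--")) {
                // nothing further to parse
            } else {
                break;  // next token cannot continue a postfix_expression
            }
        }
    }
};

bool isPostfixExpression(const std::string& s) {
    return MiniParser(s).parse();
}
```

With this sketch, isPostfixExpression("a[1].b(c,d)->e++") accepts the input, while "a[" and "a.1" are rejected because an expect fails partway through. A real parser would work on a token stream from the lexer rather than on raw characters.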
