
Bison C++ multiple error recovery with missing semicolon

I'm developing my own compiler, and I have a problem designing panic-mode error recovery for a Java grammar.

I have thought about several solutions, but the real question is:

How can I do that with Bison C++?

I tried this input:

package 2

import java.lang.*;

The error must consume tokens up to the first semicolon, and this works correctly with the rule

package_rule: PACKAGE error ';'

But if I write this code:

package 2

import java.lang.*

class y { void method() { int m }

}

What I need from the parser, as with a standard compiler, is to report these errors:

Identifier expected, at the package line. Missing ';' for the package statement, reported at the import line. Missing ';' at the int m line.

That is, after the package error I need the parser to consume tokens until the first semicolon, or to stop just before a class or interface declaration, and then to report any other errors found afterwards, such as on the line:

int m // missing ';'

Please help. I have several solutions in mind, but how do I do that with Bison C++ for a Java grammar?

Well, your basic problem is deciding how you want the parser to try to recover from syntax errors. When you have an input sequence like

package x import

do you want it to assume there should have been a semicolon there, or do you want it to assume something else got stuck in before the semicolon and it should throw stuff away until it gets to a semicolon?

The latter is what you have. The rule package: PACKAGE error ';' does precisely that: whenever it sees the keyword PACKAGE but what comes after it does not match the rest of the package rule, it throws away input until it sees a ';' and tries to continue from there.

If you want the former, you would use a rule like package: PACKAGE name error. If the parser sees PACKAGE followed by something that looks like a valid package name but no semicolon, it treats the input as if there were a semicolon there and tries to continue.

Making it do BOTH of the above things is extremely difficult. The closest would be having the grammar look something like:

package: PACKAGE name ';'  /* normal package decl */
       | PACKAGE name      /* missing semicolon -- treat this as a semantic error */
       | PACKAGE error ';' /* no name -- skip up to the next semicolon to recover */

However, this sort of thing will probably give you grammar conflicts that are hard to resolve.
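To make the "skip to a semicolon, or stop in front of class/interface" policy from the question concrete, here is a minimal hand-written C++ sketch of that synchronization step. It is not Bison's own mechanism and not a drop-in for a generated parser; the token kinds, the `Token` struct, and the `synchronize` function are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token kinds; a real Java grammar would have many more.
enum class Tok { Package, Import, Class, Interface, Ident, Number, Semi, End };

struct Token {
    Tok kind;
    int line;  // source line, useful when reporting where recovery happened
};

// Tokens that begin a type declaration; recovery stops in front of them.
inline bool startsDeclaration(Tok k) {
    return k == Tok::Class || k == Tok::Interface;
}

// Panic-mode skip: starting at `pos`, discard tokens until a ';' has been
// consumed, or until a declaration keyword or the end marker is reached.
// Returns the index at which normal parsing should resume.
inline std::size_t synchronize(const std::vector<Token>& toks, std::size_t pos) {
    assert(pos <= toks.size());
    while (pos < toks.size()) {
        Tok k = toks[pos].kind;
        if (k == Tok::Semi) {
            return pos + 1;  // swallow the ';' and resume after it
        }
        if (startsDeclaration(k) || k == Tok::End) {
            return pos;      // leave this token for the parser
        }
        ++pos;               // discard everything else
    }
    return pos;
}
```

Note that with the question's second input this policy skips the whole import line as well, so a missing ';' there would not be reported separately. That is the trade-off described above between discarding input and assuming a semicolon was present.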

Would you mind solving that problem the C++ OOP way rather than the Bison way?

Suppose you have these kinds of AST nodes defined:

struct BaseExpression {
    virtual ~BaseExpression() {}
    virtual std::string toIdentifier() = 0;
    // other members
};

struct IntLiteral : BaseExpression {
    std::string toIdentifier() {
        error::toAnIdentifier(); // report that an identifier was expected
        return "";
    }
};

struct Identifier : BaseExpression {
    std::string ident;

    explicit Identifier(std::string id) : ident(id) {}

    std::string toIdentifier() {
        return ident;
    }
};

Define a rule like this:

%union {
    BaseExpression* expr_type;
}

%type <expr_type> simple_expr

package_expr: simple_expr
{
    $1->toIdentifier(); // thus integers or float numbers would generate errors
    // do something with that identifier
}
;

package_expr: package_rule '.' simple_expr
{
    $3->toIdentifier(); // same trick
    // do something with that identifier
}
;

where simple_expr is

simple_expr: int_literal { $$ = new IntLiteral; }
           | ...
           | identifier { $$ = new Identifier(yytext); }
;
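The same idea can be tried outside of Bison. The following is a minimal, self-contained sketch of the node classes, assuming a hypothetical `error::toAnIdentifier()` that simply counts and prints the error, and a hypothetical helper `packageName()` standing in for what a package_expr action would do with $1.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical stand-in for the error reporter used by the nodes:
// it counts how many times a non-identifier was used as an identifier.
namespace error {
    inline int& count() {
        static int n = 0;
        return n;
    }
    inline void toAnIdentifier() {
        ++count();
        std::cerr << "error: identifier expected\n";
    }
}

struct BaseExpression {
    virtual ~BaseExpression() = default;
    virtual std::string toIdentifier() const = 0;
};

struct IntLiteral : BaseExpression {
    // An integer is never a valid identifier, so report and return nothing.
    std::string toIdentifier() const override {
        error::toAnIdentifier();
        return "";
    }
};

struct Identifier : BaseExpression {
    std::string ident;

    explicit Identifier(std::string id) : ident(std::move(id)) {}

    std::string toIdentifier() const override { return ident; }
};

// Hypothetical helper: what a package_expr action might do with its operand.
inline std::string packageName(const BaseExpression& e) {
    return e.toIdentifier();
}
```

With this design the grammar accepts any simple expression after PACKAGE, so `package 2` no longer triggers a syntax error at all. The mistake is reported as a semantic error by the node itself, and parsing carries on without entering Bison's error recovery.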


 