
Java CC issue - "Expansion within "(...)*" can be matched by empty string"

We've been given a grammar to patch up and parse using Java CC. One of the problems with it is several occurrences of the error "Expansion within "(...)*" can be matched by empty string". I understand this error occurs when something that can be matched zero or more times sits inside something else that can be matched zero or more times.

What I don't understand is how to fix it. (Our instructor hasn't been saying much, only "You have to be careful how you word it.")

The problem area of the grammar, along with its associated Java CC code, is shown below. Any ideas or advice would be greatly appreciated.

program := ( decl )*
           ( function ) *
           main_prog

decl := ( var_decl | const_decl )*

var_decl := var ident_list : type ( , ident_list : type)* ;

const_decl := const identifier : type = expression ( , identifier : type = expression)* ;

function :=
            type identifier ( param_list)
            ( decl )*
            ( statement ; )*
            return ( expression | e ); //e is greek epsilon character

main_prog := 
            main
            ( decl ) *
            (statement ; )*

I think the issue is with the way decl is declared. Here it is in actual Java CC code:

void decl():{}
{
( var_decl() | const_decl())*
}

If I change that Kleene closure above to + , all the other errors caused by this go away. However the instructor says the star should remain, and we need to be careful how we word it. I've found lots of resources on left factoring, left recursion removal and the like, but scant little on this particular issue. The above code doesn't actually have an error in Java CC, but is the cause of further ones as below:

void program():{}
{
( decl() )* //error here - Expansion within "(...)*" can be matched by empty string
( function() )*
main_prog()
}

void main_prog(): {}
{
< MAIN >
( decl() )* //same error on this line
(statement() < SCOLON >)*
}

void function(): {}
{
type() < ID > <LPARENT > param_list() < RPARENT >
( decl() )* //same error on this line
( statement() < SCOLON > )*
< RET> ( expression() | {} ) <SCOLON > // {} is epsilon
}

Any ideas on how to go about fixing this would be very much appreciated.

As it stands, your grammar is ambiguous: it says that a decl is zero or more declarations, and there are a number of places where you then allow zero or more decls. Because decl can match nothing at all, the surrounding ( decl )* could match the empty string any number of times, which is what the error is reporting. You don't need * in both places; just pick one or the other. Either approach will parse the same programs, but they're conceptually slightly different.
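To make the "can be matched by empty string" condition concrete, here is a minimal Java sketch that checks whether the body of a (...)* loop can match without consuming a token. The classes (Tok, Ref, Seq, Alt, Star) and the rule names decl_star and decl_single are hypothetical stand-ins invented for this illustration; they are not JavaCC's own classes, and var_decl and const_decl are reduced to two placeholder tokens each.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class NullableCheck {
    // Hypothetical mini-model of grammar expansions; these are not JavaCC classes.
    static abstract class Expr {}
    static final class Tok extends Expr {}  // any terminal, e.g. <MAIN>
    static final class Ref extends Expr { final String rule; Ref(String r) { rule = r; } }
    static final class Seq extends Expr { final List<Expr> parts; Seq(Expr... p) { parts = Arrays.asList(p); } }
    static final class Alt extends Expr { final List<Expr> choices; Alt(Expr... c) { choices = Arrays.asList(c); } }
    static final class Star extends Expr { final Expr body; Star(Expr b) { body = b; } }

    final Map<String, Expr> rules = new HashMap<>();

    // True if the expansion can match without consuming any token.
    boolean nullable(Expr e, Set<String> visiting) {
        if (e instanceof Tok) return false;   // a token always consumes input
        if (e instanceof Star) return true;   // zero repetitions is allowed
        if (e instanceof Seq) {
            for (Expr p : ((Seq) e).parts) if (!nullable(p, visiting)) return false;
            return true;
        }
        if (e instanceof Alt) {
            for (Expr c : ((Alt) e).choices) if (nullable(c, visiting)) return true;
            return false;
        }
        String name = ((Ref) e).rule;
        if (!visiting.add(name)) return false; // recursive reference: treated as non-empty in this sketch
        boolean result = nullable(rules.get(name), visiting);
        visiting.remove(name);
        return result;
    }

    // True if a (...)* loop wraps a body that can itself match the empty string.
    boolean starOverEmpty(Star s) { return nullable(s.body, new HashSet<String>()); }

    public static void main(String[] args) {
        NullableCheck g = new NullableCheck();
        g.rules.put("var_decl", new Seq(new Tok(), new Tok()));    // placeholder: begins with a token
        g.rules.put("const_decl", new Seq(new Tok(), new Tok()));  // placeholder: begins with a token
        g.rules.put("decl_star", new Star(new Alt(new Ref("var_decl"), new Ref("const_decl"))));
        g.rules.put("decl_single", new Alt(new Ref("var_decl"), new Ref("const_decl")));

        System.out.println(g.starOverEmpty(new Star(new Ref("decl_star"))));   // true: the questioner's case
        System.out.println(g.starOverEmpty(new Star(new Ref("decl_single")))); // false: no star inside decl
    }
}
```

The first check mirrors ( decl() )* with the star still inside decl; the second mirrors the same loop once decl is a single declaration.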

You could take out the * in decl:

decl := ( var_decl | const_decl )

program := ( decl )*
           ( function ) *
           main_prog

so decl represents a single declaration, and a program may start with a sequence of decls but doesn't have to. Alternatively you could leave the * in decl but take it out from the places where you reference it:

decl := ( var_decl | const_decl )*

program := decl
           ( function ) *
           main_prog

so now decl represents something like a "declarations block" rather than a single declaration - every program must start with a declarations block but that block is itself allowed to be empty.
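As a rough illustration that the two options accept the same inputs, here is a hand-written recursive-descent sketch in Java. It is not JavaCC output, and it is heavily simplified: each var_decl, const_decl, function and main_prog is collapsed to a single placeholder token ("var", "const", "func", "main"), and the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

public class DeclFixes {
    private final List<String> toks;
    private int pos = 0;

    DeclFixes(String... toks) { this.toks = Arrays.asList(toks); }

    private String peek() { return pos < toks.size() ? toks.get(pos) : "<EOF>"; }

    private boolean atDecl() { return peek().equals("var") || peek().equals("const"); }

    private void expect(String t) {
        if (!peek().equals(t)) throw new IllegalStateException("expected " + t + " but saw " + peek());
        pos++;
    }

    // Option 1: decl is exactly one declaration; the caller supplies the loop.
    private void declSingle() { expect(peek().equals("var") ? "var" : "const"); }

    boolean programOption1() {
        while (atDecl()) declSingle();              // ( decl )*
        while (peek().equals("func")) expect("func"); // ( function )*
        expect("main");                             // main_prog
        return pos == toks.size();
    }

    // Option 2: decl is a possibly empty block; the caller invokes it once, with no loop.
    private void declBlock() { while (atDecl()) pos++; }

    boolean programOption2() {
        declBlock();                                // decl
        while (peek().equals("func")) expect("func"); // ( function )*
        expect("main");                             // main_prog
        return pos == toks.size();
    }

    public static void main(String[] args) {
        System.out.println(new DeclFixes("var", "const", "func", "main").programOption1()); // true
        System.out.println(new DeclFixes("var", "const", "func", "main").programOption2()); // true
        System.out.println(new DeclFixes("main").programOption1());                         // true
        System.out.println(new DeclFixes("main").programOption2());                         // true
    }
}
```

In both versions only one loop ever decides whether another declaration follows, so every iteration of that loop consumes at least one token. That is the property the nested-star version lacked.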
