
Cocol/R predictive parser implementation Java

I'm working on a predictive parser for Cocol/R.

Here's the part of the grammar that I'm trying to parse right now:

[image: grammar excerpt, not reproduced here]

Square brackets mean zero or one occurrence. Curly braces mean zero or more.

public void Cocol( ){
    lookahead = in.next();
    if(lookahead.equals("COMPILER")){
        match("COMPILER");
        match("ident");
        ScannerSpecification();
        match("END");
        match("ident");
        match(".");
    }
    else{
        System.out.println("SYNTAX ERROR: \nCOMPILER expected, Found: "+lookahead);
        try {
            ArrayList<Integer> line = findLineOf(lookahead);
            System.out.println("Line: "+line);
        } catch (Exception ex) {
            Logger.getLogger(ScannerCocol.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

}

public void ScannerSpecification(){
    // 1 or more times "CHARACTERS" { SetDecl }
    if(lookahead.equals("CHARACTERS")){
        match("CHARACTERS");
        // 0 or More SETDecl

    }
    if (lookahead.equals("KEYWORDS")){
        //PENDING....     
    }

    if( WhiteSpaceDecl()){
          //PENDING....
    }
    boolean res=match(".");
    if(res==false){
        System.out.println("SYNTAX ERROR: \n \".\" expected, Found: "+lookahead);

        // Finding the line
        try {
            ArrayList<Integer> line = findLineOf(lookahead);
            System.out.println("Line: "+line);
        } catch (Exception ex) {
            Logger.getLogger(ScannerCocol.class.getName()).log(Level.SEVERE, null, ex);
        }


    }

}
public boolean match(String terminal){
    boolean result;
    if(terminal.equals("number")){
        result = automataNumber.simularAFN(lookahead, automataNumber.inicial, conjuntoSimbolos);
        if(result) lookahead = in.next(); // consume the token on success
        return result;
    }
    else if(terminal.equals("ident")){
        result = automataident.simularAFN(lookahead,automataident.inicial,conjuntoSimbolos);
        if(result) lookahead = in.next(); // consume the token on success
        return result;
    }
    else if(terminal.equals("string")){
        result = automataString.simularAFN(lookahead,automataString.inicial,conjuntoSimbolos);
        if(result) lookahead = in.next(); // consume the token on success
        return result;
    }
    else if(terminal.equals("char")){
        result = automataChar.simularAFN(lookahead,automataChar.inicial,conjuntoSimbolos);
        if(result) lookahead = in.next(); // consume the token on success
        return result;
    }
    else{
        if(this.lookahead.equals(terminal)){
            lookahead= in.next();
            return true;
        }
        else{
            System.out.println("Error: expected: "+terminal);
            return false;
        }
    }
}

The problem I'm facing is with the cases where I have to look for 0 or more derivations of a production (for instance, SetDecl). I know I have to keep trying to match the production until it no longer matches, but I don't know how to tell when I should report an error and when I should simply continue reading the input.

Can anyone give me some ideas?

One simple model for a predictive parser is that each function corresponding to a production returns a boolean value (success or failure). Then a rule which includes a "zero or more" expansion can be written, for example:

if(lookahead.equals("CHARACTERS")){
    match("CHARACTERS");
    while (SetDecl()) ;
    return true;
}
/* ... */

By the way, it's handy to define an operation that tests the lookahead and consumes it on a match:

if(lookahead.matches("CHARACTERS")){
    while (SetDecl()) ;
    return true;
}
/* ... */

where lookahead.matches(s) is basically: if (lookahead.equals(s)) { match(s); return true; } else { return false; }
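
Note that if lookahead is a plain java.lang.String, as in the question, String.matches is a regular-expression test and consumes nothing. To write lookahead.matches(s) literally, the lookahead would have to be a small type of its own. The sketch below is one way that could look. The class name Lookahead, the peek method, and the use of null as an end-of-input marker are assumptions made for illustration, not part of the question's code.

```java
import java.util.Iterator;

// Hypothetical wrapper around the token source: it holds the current token
// and advances only when a test succeeds.
class Lookahead {
    private final Iterator<String> in;
    private String current;

    Lookahead(Iterator<String> in) {
        this.in = in;
        advance();
    }

    private void advance() {
        current = in.hasNext() ? in.next() : null; // null marks end of input (assumption)
    }

    // Test-and-consume: on a match, move to the next token and return true;
    // otherwise leave the input untouched and return false.
    boolean matches(String terminal) {
        if (terminal.equals(current)) {
            advance();
            return true;
        }
        return false;
    }

    // Inspect the current token without consuming it.
    String peek() {
        return current;
    }
}
```

With such a wrapper, a failed test costs nothing: the caller can try the next alternative or leave a loop, because the token is still there.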

The general rule for any production function is this:

1) If the first object in the rule (token or non-terminal) cannot be matched, leave the input untouched and return false.

2) If any other object in the rule cannot be matched, emit an error and try to recover.
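
The two rules can be sketched as a small, self-contained parser. Several things here are assumptions rather than the question's code: SetDecl is simplified to ident "=" string ".", tokens come from a plain list, identifiers are recognized with a regular expression instead of the automata, and recovery is a crude "skip past the next period". The names accept, expect, skipPast and peek are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: class name, helpers and the simplified grammar are assumptions.
class SetDeclParserSketch {
    private static final String EOF = "<EOF>";
    // Reserved words must not count as identifiers, or the loop would
    // mistake "KEYWORDS" or "END" for the start of another SetDecl.
    private static final List<String> RESERVED =
            Arrays.asList("COMPILER", "CHARACTERS", "KEYWORDS", "END");

    private final Iterator<String> in;
    private String lookahead;
    final List<String> errors = new ArrayList<>();
    int setDeclCount = 0;

    SetDeclParserSketch(List<String> tokens) {
        in = tokens.iterator();
        advance();
    }

    String peek() { return lookahead; }

    private void advance() {
        lookahead = in.hasNext() ? in.next() : EOF;
    }

    // Consume the lookahead only if it equals the given terminal.
    private boolean accept(String terminal) {
        if (!lookahead.equals(terminal)) return false;
        advance();
        return true;
    }

    // Like accept, but a mismatch is recorded as a syntax error.
    private boolean expect(String terminal) {
        if (accept(terminal)) return true;
        errors.add("expected " + terminal + ", found " + lookahead);
        return false;
    }

    // Regex stand-in for the identifier automaton used in the question.
    private boolean isIdent() {
        return lookahead.matches("[A-Za-z][A-Za-z0-9]*") && !RESERVED.contains(lookahead);
    }

    private boolean isString() {
        return lookahead.length() >= 2 && lookahead.startsWith("\"") && lookahead.endsWith("\"");
    }

    // Crude panic-mode recovery: discard tokens up to and including the terminal.
    private void skipPast(String terminal) {
        while (!lookahead.equals(EOF) && !accept(terminal)) advance();
    }

    // SetDecl = ident "=" string "."   (assumed, simplified form)
    boolean setDecl() {
        if (!isIdent()) return false;   // rule 1: input untouched, no error
        advance();
        boolean ok = expect("=");
        if (ok) {
            if (isString()) advance();
            else { errors.add("expected string, found " + lookahead); ok = false; }
        }
        ok = ok && expect(".");
        if (ok) setDeclCount++;
        else skipPast(".");             // rule 2: error already recorded, now recover
        return true;                    // the rule was started, so the caller keeps looping
    }

    // "CHARACTERS" { SetDecl }
    boolean charactersSection() {
        if (!accept("CHARACTERS")) return false;
        while (setDecl()) { /* zero or more SetDecl */ }
        return true;
    }
}
```

The loop ends quietly as soon as the lookahead cannot start a SetDecl, which is how "stop reading" is told apart from "report an error": an error is only raised once a rule has consumed its first symbol. Returning true after recovery is a design choice of this sketch, so that one malformed declaration does not hide the ones that follow.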

