
Java Bison and Jflex error for redeclared/undeclared variables

I am making a compiler with Jflex and Bison. Jflex does the lexical analysis. Bison does the parsing.

The lexical analysis (in a .l file) works perfectly: it tokenizes the input and passes the tokens to the .y file for Bison to parse.

I need the parser to print an error for redeclared/undeclared variables. My thought is that it would need some sort of memory of all the variables declared so far, so that it can produce an error when a variable is declared twice or when an undeclared variable is used. For example, given the tokens "bool", "test", "=", "true", ";" and, on a new line, "test2", "=", "false", ";", the parser would need to remember "test" so that when it parses the second line it can consult that memory, see that "test2" is undeclared, and print an error.

What I'm confused about is how to create a memory like that with Bison using Java in the .y file. With C, you would use the -d flag and it would make 2 files with enum types and a header file which would keep track of the declared variables, but in Java I'm not sure I can do the same, as I can't structure the grammar in any way that makes it remember variable names.

I could make a symbol table in Java code to check for redeclared variables, but in the main() in the .y file I have:

  public static void main(String args[]) throws IOException {
    EXAMPLELexer lexer = new EXAMPLELexer(System.in);
    EXAMPLE parser = new EXAMPLE(lexer);
    if (parser.parse()) {
      System.out.println("VALID FROM PARSER");
    } else {
      System.out.println("ERROR FROM PARSER");
    }
  }

There is no way to get the tokens individually and pass them into another Java instance or anything like that. %union{} doesn't work with Java, so I don't know how this is even possible. I can't find any documentation explaining this, so I would love some answers!

It's actually a lot simpler to add your own data to a Bison-generated Java parser than it is to a C parser (or even a C++ parser).

Note that Bison's Java API does not have unions, mostly because Java doesn't have unions. All semantic values are non-primitive types, so they derive from Object. If you need to, you can cast them to a more precise type, or even a primitive type.
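As a rough illustration of that casting, here is a small stand-alone Java sketch. It does not use Bison at all: the class and method names are made up, and the Object parameters merely stand in for the semantic values an action would receive.

```java
// Hypothetical stand-in: the Object parameters play the role of untyped semantic values.
class SemanticValueDemo {
    // Mimics what an action might do with untyped values: cast them down before use.
    static String describeDeclaration(Object typeName, Object identifier, Object initializer) {
        String type = (String) typeName;       // reference cast to a more precise type
        String name = (String) identifier;
        boolean init = (Boolean) initializer;  // cast to the wrapper, then auto-unboxed
        return type + " " + name + " = " + init;
    }

    public static void main(String[] args) {
        // A lexer would supply values like these; the literal true is boxed to a Boolean.
        Object type = "bool";
        Object id = "test";
        Object value = true;
        System.out.println(describeDeclaration(type, id, value));
    }
}
```

A wrong cast fails at run time with a ClassCastException, so it is worth being consistent about which type each token and nonterminal carries.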

(There is an option to define a more precise base class for semantic value types, but Object is probably a good place to start.)

The %code { ... } blocks are just copied into the parser class. So you can add your own members, as well as methods to manipulate them. If you want a symbol table, just add it as a HashMap to the parser class, and then you can add whatever you like to it in your actions.
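A minimal sketch of what such members might look like, written as a stand-alone class so it can be compiled and tried on its own. The names (SymbolTableSketch, declare, checkDeclared) and the choice of mapping a name to its type name are assumptions for illustration; in the .y file the field and the two methods would go inside a %code { ... } block instead of a separate class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; in the grammar file these members would live in a %code block.
class SymbolTableSketch {
    // Maps each declared variable name to its declared type name, e.g. "test" -> "bool".
    private final Map<String, String> symbols = new HashMap<>();

    // Records a declaration. Returns false and reports an error if the name already exists.
    boolean declare(String name, String type) {
        if (symbols.containsKey(name)) {
            System.err.println("error: variable '" + name + "' redeclared");
            return false;
        }
        symbols.put(name, type);
        return true;
    }

    // Checks a use. Returns false and reports an error if the name was never declared.
    boolean checkDeclared(String name) {
        if (!symbols.containsKey(name)) {
            System.err.println("error: variable '" + name + "' undeclared");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch();
        table.declare("test", "bool");   // corresponds to: bool test = true;
        table.checkDeclared("test2");    // corresponds to: test2 = false;  (reports undeclared)
    }
}
```

With something like this in place, the action for a declaration rule could call declare with the identifier's semantic value, and the action for an assignment or expression could call checkDeclared. Which rules those are depends on your grammar.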

Since all the parser actions are within the parser class, they have direct access to whatever members and member functions you add to the parser. All of Bison's internal members and member functions have names starting with yy, except for the member functions documented in the manual, so you can use almost any names you want without fear of name collision.

You can also use %parse-param to add arguments to the constructor; each argument corresponds to a class member. But that's probably not necessary for this particular exercise.

Of course, you'll have to figure out what an appropriate value type for the symbol is; that depends completely on what you're trying to do with the symbols. If you only want to validate that the symbols are defined when they are used, I suppose you could get away with a HashSet, but I'm sure eventually you'll want to store some more useful information.
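To make the trade-off concrete, here is a hedged sketch contrasting the two choices. SymbolInfo and its fields (a type name and a line number) are invented examples of "more useful information", not something Bison provides.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SymbolValueChoices {
    // Hypothetical richer entry: the kind of value a HashMap could hold later on.
    static final class SymbolInfo {
        final String type;  // declared type name, e.g. "bool"
        final int line;     // line of the declaration, useful in error messages
        SymbolInfo(String type, int line) {
            this.type = type;
            this.line = line;
        }
    }

    // Names only: Set.add returns false when the name is already present.
    static boolean isRedeclaration(Set<String> declared, String name) {
        return !declared.add(name);
    }

    public static void main(String[] args) {
        // Option 1: a HashSet is enough to detect redeclared and undeclared names.
        Set<String> declared = new HashSet<>();
        System.out.println(isRedeclaration(declared, "test"));  // first declaration
        System.out.println(isRedeclaration(declared, "test"));  // second declaration
        System.out.println(declared.contains("test2"));         // use of an unknown name

        // Option 2: a HashMap keeps details about each symbol alongside its name.
        Map<String, SymbolInfo> table = new HashMap<>();
        table.put("test", new SymbolInfo("bool", 1));
        System.out.println(table.get("test").type);
    }
}
```

Starting with the map costs little extra and avoids reworking the actions once type checking or better diagnostics are needed.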
