Java Bison and JFlex error for redeclared/undeclared variables

I am making a compiler with JFlex and Bison. JFlex does the lexical analysis and Bison does the parsing.

The lexical analysis (in a .l file) works perfectly. It tokenizes the input and passes the tokens to the .y file for Bison to parse.

I need the parser to print an error for redeclared/undeclared variables. My thought is that it would need some sort of memory to remember all the variables declared so far, so that it can report an error when an incoming token redeclares a variable or uses an undeclared one. For example, given the tokens "bool", "test", "=", "true", ";" and, on a new line, "test2", "=", "false", ";", the parser would need to remember "test" so that when it parses the second line it can consult that memory, see that "test2" is undeclared, and print an error.

What I'm confused about is how to build a memory like that with Bison using Java in the .y file. With C, you would use the -d flag and it would make 2 files with enum types and a header file which would keep track of the declared variables, but in Java I'm not sure I can do the same, since I can't structure the grammar in any way that makes it remember variable names.

I could make a symbol table in Java code to check for redeclared variables, but in the main() in the .y file I have

    public static void main(String[] args) throws IOException {
        // The lexer class name must match the generated class (EXAMPLELexer).
        EXAMPLELexer lexer = new EXAMPLELexer(System.in);
        EXAMPLE parser = new EXAMPLE(lexer);
        // parse() returns true when the whole input was accepted.
        if (parser.parse()) {
            System.out.println("VALID FROM PARSER");
        } else {
            System.out.println("ERROR FROM PARSER");
        }
    }

There is no way to get the tokens individually and pass them into another Java instance or anything like that. %union{} doesn't work with Java, so I don't know how this is even possible. I can't find a single piece of documentation explaining this, so I would love some answers!

It's actually a lot simpler to add your own data to a Bison-generated Java parser than it is to a C parser (or even a C++ parser).

Note that Bison's Java API does not have unions, mostly because Java doesn't have unions. All semantic values are non-primitive types, so they derive from Object. If you need to, you can cast them to a more precise type, or even a primitive type.
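To illustrate the casting point, here is a minimal sketch in plain Java. It is not generated parser code: the class and method names are made up for the example, and in a real grammar the Object would be a semantic value such as $1 inside an action.

```java
// Sketch only: mimics narrowing a semantic value that is typed as Object.
class SemanticValueDemo {
    // Narrow an Object to a String, as an action might do for an identifier.
    static String asName(Object value) {
        return (String) value;
    }

    // Narrow an Object to a primitive boolean by casting to the Boolean
    // wrapper and letting Java unbox it.
    static boolean asFlag(Object value) {
        return (Boolean) value;
    }

    public static void main(String[] args) {
        Object ident = "test";          // e.g. the value attached to an identifier token
        Object literal = Boolean.TRUE;  // e.g. the value attached to the literal true
        System.out.println(asName(ident) + " = " + asFlag(literal)); // prints "test = true"
    }
}
```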

(There is an option to define a more precise base class for semantic value types, but Object is probably a good place to start.)

The %code { ... } blocks are just copied into the parser class, so you can add your own members, as well as methods to manipulate them. If you want a symbol table, just add it as a HashMap to the parser class, and then you can add whatever you like to it in your actions.
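As a rough sketch of what such members could look like, the class below holds a HashMap-based symbol table plus two helper methods. It is written as a standalone class so it can be compiled and run on its own; in the grammar file, the fields and methods would sit inside a %code { ... } block instead. Every name here (symbols, errors, declare, use) is hypothetical, and storing the type as a String is just one possible choice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: stands in for members that would be added to the parser class.
class SymbolTableSketch {
    // Maps each declared variable name to its declared type, e.g. "test" -> "bool".
    private final Map<String, String> symbols = new HashMap<>();
    // Semantic error messages collected so far.
    private final List<String> errors = new ArrayList<>();

    // Intended to be called from the action of a declaration rule.
    // Returns false and records an error if the name already exists.
    boolean declare(String name, String type) {
        if (symbols.containsKey(name)) {
            errors.add("redeclared variable: " + name);
            return false;
        }
        symbols.put(name, type);
        return true;
    }

    // Intended to be called from the action of any rule that uses a variable.
    // Returns false and records an error if the name was never declared.
    boolean use(String name) {
        if (!symbols.containsKey(name)) {
            errors.add("undeclared variable: " + name);
            return false;
        }
        return true;
    }

    List<String> errors() {
        return errors;
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch();
        table.declare("test", "bool"); // bool test = true;
        table.use("test2");            // test2 = false;   (never declared)
        table.declare("test", "bool"); // a second declaration of test
        table.errors().forEach(System.out::println);
    }
}
```

Inside a grammar action, the call would then be something along the lines of declare((String) $2, (String) $1), assuming a declaration rule whose first two symbols are the type and the identifier; the exact rule shape depends on your grammar.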

Since all the parser actions are within the parser class, they have direct access to whatever members and member functions you add to the parser. All of Bison's internal members and member functions have names starting with yy, except for the member functions documented in the manual, so you can use almost any names you want without fear of name collision.

You can also use %parse-param to add arguments to the constructor; each argument corresponds to a class member. But that's probably not necessary for this particular exercise.

Of course, you'll have to figure out what an appropriate value type for the symbol is; that depends completely on what you're trying to do with the symbols. If you only want to validate that the symbols are defined when they are used, I suppose you could get away with a HashSet, but I'm sure eventually you'll want to store some more useful information.
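For the validation-only case, a HashSet version might look like the sketch below. The class and method names are invented for illustration; the one library fact it relies on is that Set.add returns false when the element is already present, which doubles as a redeclaration check.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch only: tracks which names have been declared, and nothing else.
class DeclaredNames {
    private final Set<String> declared = new HashSet<>();

    // Returns false if the name was already declared (a redeclaration).
    boolean declare(String name) {
        return declared.add(name);
    }

    // Returns true if the name has been declared before this use.
    boolean isDeclared(String name) {
        return declared.contains(name);
    }

    public static void main(String[] args) {
        DeclaredNames names = new DeclaredNames();
        names.declare("test");                          // bool test = true;
        System.out.println(names.isDeclared("test"));   // prints "true"
        System.out.println(names.isDeclared("test2"));  // prints "false"
        System.out.println(names.declare("test"));      // prints "false" (redeclared)
    }
}
```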
