
ANTLR 4 - Tree pattern matching

I am trying to understand parse tree matching in ANTLR 4, so I have the following Java code:

package sampleCodes;

public class fruits {
  public static void main(String[] args){
    int a = 10;
    System.out.println(a);
  }
}

I am using ANTLR 4 to create a parse tree of this code. Now I want to use the tree pattern matching function to find "int a = 10;". There is a doc on GitHub, https://github.com/antlr/antlr4/blob/master/doc/tree-matching.md, which explains this with an example (something like this):

ParseTree t = ...; // assume t is a statement
ParseTreePattern p = parser.compileParseTreePattern("<ID> = <expr>;", MyParser.RULE_statement);
ParseTreeMatch m = p.match(t);
if ( m.succeeded() ) {...}

From reading through this doc and a few other resources, what I understood was that in:

ParseTreePattern p = parser.compileParseTreePattern("<ID> = <expr>;", MyParser.RULE_statement);

the rule passed as the second argument must be able to correctly parse the pattern provided as the first argument. The grammar I am using is the Java grammar given here: https://github.com/antlr/grammars-v4/tree/master/java

JavaLexer.g4, JavaParser.g4

I cannot get much information from the above GitHub doc on how to structure the pattern string and its corresponding rule. So I have tried a few combinations to get a match, but none of them seems to work. For example:

ParseTreePattern p = parser.compileParseTreePattern("<variableDeclaratorId> = <variableInitializer>", parser.RULE_variableDeclarator);
ParseTreeMatch m = p.match(tree);
System.out.println(m);

This gives:

Match failed; found 0 labels

I know I am certainly doing something wrong in my string pattern. Can anyone please help by explaining this pattern matching function and telling me what the correct arguments are in this case? Also, it would be really helpful to have links to some useful resources where I can learn more about this and work on complex patterns. (I could not find it in the ANTLR 4 reference.)

[Image: a part of the parse tree for this code]

I think what you want is described in "Combining XPath and tree pattern matching".

Something like this perhaps:

import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.pattern.ParseTreeMatch;
import org.antlr.v4.runtime.tree.pattern.ParseTreePattern;

import java.util.List;

public class Main {

  public static void main(String[] args) {

    String source = "package sampleCodes;\n" +
            "\n" +
            "public class fruits {\n" +
            "\n" +
            "  static { int q = 42; }\n" +
            "\n" +
            "  public static void main(String[] args){\n" +
            "    int a = 10;\n" +
            "    System.out.println(a);\n" +
            "  }\n" +
            "}\n";

    JavaLexer lexer = new JavaLexer(CharStreams.fromString(source));
    JavaParser parser = new JavaParser(new CommonTokenStream(lexer));
    ParseTree tree = parser.compilationUnit();

    ParseTreePattern p = parser.compileParseTreePattern("<IDENTIFIER> = <expression>", JavaParser.RULE_variableDeclarator);
    List<ParseTreeMatch> matches = p.findAll(tree, "//variableDeclarator");

    for (ParseTreeMatch match : matches) {
      System.out.println("\nMATCH:");
      System.out.printf(" - IDENTIFIER: %s\n", match.get("IDENTIFIER").getText());
      System.out.printf(" - expression: %s\n", match.get("expression").getText());
    }
  }
}

resulting in the following output:

MATCH:
 - IDENTIFIER: q
 - expression: 42

MATCH:
 - IDENTIFIER: a
 - expression: 10

Regarding the grammar you used, your string pattern is correct.

The reason match() is not finding anything is probably that you're passing the whole tree to it (i.e. the tree with the rule compilationUnit at its root) and expecting it to search the whole tree. match() only tries to match the pattern against the given ParseTree object itself. It does NOT try to find the pattern in subtrees of the given ParseTree.

For it to work, you first need to find all nodes of type VariableDeclaratorContext (by overriding the enterVariableDeclarator() method of the generated base listener) and then try to match the pattern on each of them, e.g.

import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.ParseTreeWalker;
import org.antlr.v4.runtime.tree.pattern.ParseTreeMatch;
import org.antlr.v4.runtime.tree.pattern.ParseTreePattern;

public class Main {
    public static void main(String[] args) {
        String javaCode = "public class Main {\n" +
                "            public static void main() {\n" +
                "                    int i =0;\n" +
                "            }\n" +
                "}";


        JavaGLexer javaGLexer = new JavaGLexer(CharStreams.fromString(javaCode));
        CommonTokenStream tokens = new CommonTokenStream(javaGLexer);
        JavaGParser javaGParser = new JavaGParser(tokens);
        ParseTree tree = javaGParser.compilationUnit();
        // RULE_variableDeclarator is a static constant, so reference it through the class
        ParseTreePattern p = javaGParser.compileParseTreePattern("<variableDeclaratorId> = <variableInitializer>", JavaGParser.RULE_variableDeclarator);
        ParseTreeWalker walker = new ParseTreeWalker();
        IDListener idListener = new IDListener();
        walker.walk(idListener, tree);
        ParseTreeMatch match;
        for (JavaGParser.VariableDeclaratorContext ctx: idListener.getVarCTXs())
        {
            match  = p.match(ctx);
            if (match.succeeded()) {
                System.out.println("Match \n" + " - IDENTIFIER: " +
                        match.get("variableDeclaratorId").getText() +
                        "\n - INITIALIZER: " + match.get("variableInitializer").getText());
            }
        }
    }
}

IDListener extends JavaGBaseListener, overrides enterVariableDeclarator(), and puts the variableDeclarator nodes in a list that is retrievable through getVarCTXs().

Output is:

Match
 - IDENTIFIER: i
 - INITIALIZER: 0
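
To make the difference between matching at one node and searching the subtrees concrete, here is a small toy model in plain Java. It does not use the ANTLR runtime at all: Node, matchAt(), findAll(), sampleTree() and samplePattern() are hypothetical names made up for this illustration, and the real findAll() selects candidate nodes through an XPath expression rather than visiting every node. Treat it as a sketch of the idea, not of ANTLR's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes or methods belong to the ANTLR runtime.
class MatchVsFindDemo {

    /** A minimal tree node: a rule name (or token text) plus its children. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    /** Mimics match(): compares the pattern with this node only, never with its descendants. */
    static boolean matchAt(Node tree, Node pattern) {
        String p = pattern.name;
        if (p.length() > 2 && p.startsWith("<") && p.endsWith(">")) {
            // A tag such as <variableInitializer> accepts any subtree rooted at that rule.
            return tree.name.equals(p.substring(1, p.length() - 1));
        }
        if (!tree.name.equals(p) || tree.children.size() != pattern.children.size()) {
            return false;
        }
        for (int i = 0; i < tree.children.size(); i++) {
            if (!matchAt(tree.children.get(i), pattern.children.get(i))) {
                return false;
            }
        }
        return true;
    }

    /** Mimics the search step: visit every node and try matchAt() on each one. */
    static List<Node> findAll(Node tree, Node pattern) {
        List<Node> hits = new ArrayList<>();
        collect(tree, pattern, hits);
        return hits;
    }

    private static void collect(Node tree, Node pattern, List<Node> hits) {
        if (matchAt(tree, pattern)) {
            hits.add(tree);
        }
        for (Node child : tree.children) {
            collect(child, pattern, hits);
        }
    }

    /** Builds one variableDeclarator subtree such as "a = 10". */
    static Node declarator(String id, String value) {
        return new Node("variableDeclarator",
                new Node("variableDeclaratorId", new Node(id)),
                new Node("="),
                new Node("variableInitializer", new Node(value)));
    }

    /** A heavily simplified stand-in for the parse tree of the sample class. */
    static Node sampleTree() {
        return new Node("compilationUnit",
                new Node("classBody",
                        declarator("q", "42"),
                        new Node("methodBody", declarator("a", "10"))));
    }

    /** Stand-in for the compiled pattern "<variableDeclaratorId> = <variableInitializer>". */
    static Node samplePattern() {
        return new Node("variableDeclarator",
                new Node("<variableDeclaratorId>"),
                new Node("="),
                new Node("<variableInitializer>"));
    }

    public static void main(String[] args) {
        Node tree = sampleTree();
        Node pattern = samplePattern();
        System.out.println("match at root: " + matchAt(tree, pattern));
        System.out.println("matches in subtrees: " + findAll(tree, pattern).size());
    }
}
```

In this model, matching at the compilationUnit root fails because the root rule differs from the pattern's rule, while visiting each node finds both declarators. That is the same reason the listener-based loop above (or findAll() with an XPath) succeeds where a single match() on the whole tree does not.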
