Matching OR expression using Grappa (Java PEG Parser)

I'm new to PEG parsing and am trying to write a simple parser for an expression like "term1 OR term2 anotherterm", ideally producing an AST that looks something like this:

          OR
-----------|---------
|                    |
"term1"            "term2 anotherterm"

I'm currently using Grappa ( https://github.com/fge/grappa ), but it doesn't match even the more basic expression "term1 OR term2". This is what I have:

package grappa;

import com.github.fge.grappa.annotations.Label;
import com.github.fge.grappa.parsers.BaseParser;
import com.github.fge.grappa.rules.Rule;

public class ExprParser extends BaseParser<Object> {

  @Label("expr")
  Rule expr() {
    return sequence(terms(), wsp(), string("OR"), wsp(), terms(), push(match()));
  }

  @Label("terms")
  Rule terms() {
    return sequence(whiteSpaces(),
        join(term()).using(wsp()).min(0),
        whiteSpaces());
  }

  @Label("term")
  Rule term() {
    return sequence(oneOrMore(character()), push(match()));
  }

  Rule character() {
    return anyOf(
        "0123456789" +
        "abcdefghijklmnopqrstuvwxyz" +
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
        "-_");
  }

  @Label("whiteSpaces")
  Rule whiteSpaces() {
    return join(zeroOrMore(wsp())).using(sequence(optional(cr()), lf())).min(0);
  }

}

Can anyone point me in the right direction?

(author of grappa here...)

OK, so, what you seem to want is in fact a parse tree.

Very recently, an extension to grappa (2.0.x+) was developed that can answer your needs: https://github.com/ChrisBrenton/grappa-parsetree

Grappa, by default, only "blindly" matches text and has a stack at its disposal, so you could have, for instance:

public Rule oneOrOneOrEtc()
{
    return join(one(), push(match())).using(or()).min(1);
}

But then all of your matches would end up on the stack... Not very practical, but still usable in some situations (see, for instance, sonar-sslr-grappa).
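To illustrate why a bare stack is awkward here, the following is a minimal plain-Java sketch that does not use Grappa at all. The class `StackSketch` and its methods are hypothetical names invented for this illustration; the sketch only mimics the effect of pushing each matched operand, using a regular-expression split in place of real parsing rules.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; this is not Grappa's value stack.
final class StackSketch {

    // Pushes every operand found between "OR" separators,
    // imitating a push of the matched text once per operand.
    static Deque<String> pushOperands(String input) {
        Deque<String> stack = new ArrayDeque<>();
        for (String operand : input.split("\\s+OR\\s+")) {
            stack.push(operand.trim());
        }
        return stack;
    }

    // Pops everything off and restores source order:
    // a stack hands back the last pushed operand first.
    static List<String> drainInSourceOrder(Deque<String> stack) {
        List<String> operands = new ArrayList<>();
        while (!stack.isEmpty()) {
            operands.add(stack.pop());
        }
        Collections.reverse(operands);
        return operands;
    }

    public static void main(String[] args) {
        Deque<String> stack = pushOperands("term1 OR term2 anotherterm");
        System.out.println(drainInSourceOrder(stack));
    }
}
```

The caller ends up with a flat pile of strings and has to rebuild any structure (which operand belongs to which OR) by hand, which is the inconvenience a parse tree avoids.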

In your case you want this package. You can do this with it:

// define your root node
public final class Root
    extends ParseNode
{
    public Root(final String match, final List<ParseNode> children)
    {
        super(match, children);
    }
}

// define your parse node
public final class Alternative
    extends ParseNode
{
    public Alternative(final String match, final List<ParseNode> children)
    {
        super(match, children);
    }
}

That is the minimal implementation. And then your parser can look like this:

@GenerateNode(Alternative.class)
public Rule alternative() // or whatever
{
    return // whatever an alternative is
}

@GenerateNode(Root.class)
public Rule root()
{
    return join(alternative())
        .using(or())
        .min(1);
}

What happens here is that the root node is matched before the alternatives. If, say, you have the string:

a or b or c or d

then the root node will match the whole sequence, and it will have four alternative children matching a, b, c, and d respectively.
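To make that shape concrete, here is a small plain-Java sketch that uses neither Grappa nor grappa-parsetree. `OrTreeSketch`, its nested `Node` class, and `parseRoot` are hypothetical names made up for this illustration (they are not the library's `ParseNode` API), and a simple split on " or " stands in for the real rules. It only shows the resulting structure: one root covering the whole input, with one child per alternative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not grappa-parsetree's classes.
final class OrTreeSketch {

    // Minimal stand-in for a parse node: a label, the matched text, and children.
    static final class Node {
        final String label;
        final String match;
        final List<Node> children;

        Node(String label, String match, List<Node> children) {
            this.label = label;
            this.match = match;
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }
    }

    // Builds a root whose match is the whole input and whose children
    // are the alternatives separated by " or ".
    static Node parseRoot(String input) {
        List<Node> alternatives = new ArrayList<>();
        for (String part : input.split(" or ")) {
            alternatives.add(new Node("Alternative", part.trim(), new ArrayList<Node>()));
        }
        return new Node("Root", input, alternatives);
    }

    public static void main(String[] args) {
        Node root = parseRoot("a or b or c or d");
        System.out.println(root.label + " matched: " + root.match);
        for (Node child : root.children) {
            System.out.println("  " + child.label + " matched: " + child.match);
        }
    }
}
```

With the real extension, the annotated rules would build the equivalent nodes during parsing, so no manual splitting like this would be needed.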

Full credit here goes to Christopher Brenton for coming up with this idea in the first place!
