
Matching OR expression using Grappa (Java PEG Parser)

I'm new to PEG parsing and am trying to write a simple parser that turns an expression like "term1 OR term2 anotherterm" into an AST, ideally one that looks something like this:

          OR
-----------|---------
|                    |
"term1"            "term2 anotherterm"

I'm currently using Grappa ( https://github.com/fge/grappa ) but it's not matching even the more basic expression "term1 OR term2". This is what I have:

package grappa;

import com.github.fge.grappa.annotations.Label;
import com.github.fge.grappa.parsers.BaseParser;
import com.github.fge.grappa.rules.Rule;

public class ExprParser extends BaseParser<Object> {

  @Label("expr")
  Rule expr() {
    return sequence(terms(), wsp(), string("OR"), wsp(), terms(), push(match()));
  }

  @Label("terms")
  Rule terms() {
    return sequence(whiteSpaces(),
        join(term()).using(wsp()).min(0),
        whiteSpaces());
  }

  @Label("term")
  Rule term() {
    return sequence(oneOrMore(character()), push(match()));
  }

  Rule character() {
    return anyOf(
        "0123456789" +
        "abcdefghijklmnopqrstuvwxyz" +
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
        "-_");
  }

  @Label("whiteSpaces")
  Rule whiteSpaces() {
    return join(zeroOrMore(wsp())).using(sequence(optional(cr()), lf())).min(0);
  }

}

Can anyone point me in the right direction?

(author of grappa here...)

OK, so, what you seem to want is in fact a parse tree.

An extension to grappa (2.0.x+) was developed very recently that addresses exactly this need: https://github.com/ChrisBrenton/grappa-parsetree .

Grappa, by default, only "blindly" matches text and has a stack at its disposal, so you could have, for instance:

public Rule oneOrOneOrEtc()
{
    // join() takes a single rule, so the match and the push are wrapped in a sequence
    return join(sequence(one(), push(match()))).using(or()).min(1);
}

But then all of your matches would have been on the stack... Not very practical, but still usable in some situations (see, for instance, sonar-sslr-grappa ).
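To make the "everything ends up on the stack" point concrete, here is a minimal plain-Java sketch. It does not use Grappa at all: it only imitates what pushing each joined match would leave behind, using a `Deque` as a stand-in for the value stack. The class and method names (`StackSketch`, `pushAlternatives`, `drain`) are hypothetical, and splitting on a lowercase `or` surrounded by whitespace is an assumption made for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only: plain Java, no Grappa classes involved.
final class StackSketch {
    // Imitates pushing every matched alternative onto a value stack.
    static Deque<String> pushAlternatives(String input) {
        Deque<String> stack = new ArrayDeque<>();
        for (String alt : input.trim().split("\\s+or\\s+")) {
            stack.push(alt);
        }
        return stack;
    }

    // Pops everything off the stack and restores the original input order.
    static List<String> drain(Deque<String> stack) {
        List<String> out = new ArrayList<>();
        while (!stack.isEmpty()) {
            out.add(0, stack.pop());
        }
        return out;
    }

    public static void main(String[] args) {
        Deque<String> stack = pushAlternatives("a or b or c or d");
        System.out.println(stack.peek());  // the top of the stack is the last match
        System.out.println(drain(stack));  // popping is needed to get the matches back
    }
}
```

The stack holds the matches but no structure: whoever consumes it has to know how many values to pop and in which order, which is why a parse tree is more convenient here.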

In your case you want the grappa-parsetree package linked above. With it, you can do this:

// define your root node
public final class Root
    extends ParseNode
{
    public Root(final String match, final List<ParseNode> children)
    {
        super(match, children);
    }
}

// define your parse node
public final class Alternative
    extends ParseNode
{
    public Alternative(final String match, final List<ParseNode> children)
    {
        super(match, children);
    }
}

That is the minimal implementation. And then your parser can look like this:

@GenerateNode(Alternative.class)
public Rule alternative() // or whatever
{
    return // whatever an alternative is
}

@GenerateNode(Root.class)
public Rule root()
{
    return join(alternative())
        .using(or())
        .min(1);
}

What happens here is that the root rule is entered before any alternative is matched, so the alternatives become children of the root node. If, say, you have the string:

a or b or c or d

then the root node will match the whole sequence, and it will have four alternative children, matching a, b, c, and d respectively.
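The resulting shape can be sketched in plain Java. This is not the grappa-parsetree API: `SketchNode` is a hypothetical stand-in for a parse node, and `TreeShapeSketch.parse` simply splits on a separator word instead of running a grammar. It only shows what "a root holding the whole match, with one child per alternative" looks like, including for the input from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a parse node; NOT grappa-parsetree's ParseNode.
final class SketchNode {
    final String match;
    final List<SketchNode> children;

    SketchNode(String match, List<SketchNode> children) {
        this.match = match;
        this.children = Collections.unmodifiableList(new ArrayList<>(children));
    }
}

final class TreeShapeSketch {
    // Builds a root node whose children are the parts between separator words.
    static SketchNode parse(String input, String separator) {
        String regex = "\\s+" + Pattern.quote(separator) + "\\s+";
        List<SketchNode> alternatives = new ArrayList<>();
        for (String part : input.trim().split(regex)) {
            alternatives.add(new SketchNode(part, Collections.<SketchNode>emptyList()));
        }
        return new SketchNode(input, alternatives);
    }

    public static void main(String[] args) {
        SketchNode root = parse("term1 OR term2 anotherterm", "OR");
        System.out.println(root.match);          // the root keeps the whole input
        for (SketchNode alt : root.children) {
            System.out.println("  " + alt.match); // one line per alternative
        }
    }
}
```

With the real extension, the grammar rules annotated with `@GenerateNode` decide where nodes are created; the sketch above only mirrors the tree that comes out.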

Full credits here go to Christopher Brenton for coming up with this idea in the first place!
