antlr4: operator precedence changes

Question

I have a question about ANTLR4 and operator precedence. I have the following grammar:

grammar TestGrammar;

@header {
package some.package;
}

fragment A : ('A'|'a') ;
fragment E : ('E'|'e') ;
fragment F : ('F'|'f') ;
fragment L : ('L'|'l') ;
fragment R : ('R'|'r') ;
fragment S : ('S'|'s') ;
fragment T : ('T'|'t') ;
fragment U : ('U'|'u') ;
BOOL : (T R U E | F A L S E) ;

AND : '&' ;
OR : '|' ;
IMPLIES : '=>' ;

AS : 'als' ;

ID : [a-zA-Z_][a-zA-Z0-9_]+ ;

value_assignment : AS name=ID ;

formula  :
  BOOL /*(variable=value_assignment)?*/  #ExpressionBoolean
  | identifier=ID /*(variable=value_assignment)?*/  #ExpressionIdentifier
  | leftFormula=formula operator=(AND | OR) rightFormula=formula /*(variable=value_assignment)?*/  #ExpressionBinaryAndOr
  | leftFormula=formula operator=IMPLIES rightFormula=formula /*(variable=value_assignment)?*/  #ExpressionBinaryImplies
;

It is used for propositional logic. I want conjunction (&) and disjunction (|) to be evaluated first, and the implication (=>) afterwards. With the grammar as shown, this works as expected. Note that the value_assignment references are commented out. I have some test cases to exercise the functionality:

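import java.util.HashSet;
import java.util.Set;

import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.ParserRuleContext;
import org.testng.Assert;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;

public class TestGrammarTest {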

    private static ParserRuleContext parse(final String input) {
        final TestGrammarLexer lexer = new TestGrammarLexer(CharStreams.fromString(input));
        final CommonTokenStream tokens = new CommonTokenStream(lexer);
        return new TestGrammarParser(tokens).formula();
    }

    private static Set<Object> states() {
        final Set<Object> states = new HashSet<Object>();

        states.add(0);
        states.add(1);
        states.add(2);

        return states;
    }

    @DataProvider (name = "testEvaluationData")
    public Object[][] testEvaluationData() {
        return new Object [][] {
            {"true & false => true", states(), states()},
            {"false & true => true", states(), states()},
        };
    }

    @Test (dataProvider = "testEvaluationData")
    public void testEvaluation(final String input, final Set<Object> states, final Set<Object> expectedResult) {
        System.out.println("test evaluation of <" + input + ">");
        Assert.assertEquals(
                new TestGrammarVisitor(states).visit(parse(input)),
                expectedResult
            );
    }

}

I think I also need to show (a smaller version of) my visitor to make the behaviour clear. The implementation is as straightforward as you would expect:

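import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

import org.antlr.v4.runtime.tree.TerminalNode;
// plus the generated context classes nested in TestGrammarParser

public class TestGrammarVisitor extends TestGrammarBaseVisitor<Set<Object>> {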

    final Set<Object> states;

    public TestGrammarVisitor(final Set<Object> theStates) {
        states = Collections.unmodifiableSet(theStates);
    }

    @Override
    public Set<Object> visitExpressionBoolean(final ExpressionBooleanContext ctx) {
        System.out.println("\nvisitExpressionBoolean called ...\n");
        final TerminalNode node = ctx.BOOL();
        final Set<Object> result;
        if (node.getText().equalsIgnoreCase("true")) {
            result = new HashSet<>(states);
            return result;
        }
        result = Collections.emptySet();
        return result;
    }

    @Override
    public Set<Object> visitExpressionBinaryAndOr(final ExpressionBinaryAndOrContext ctx) {
        System.out.println("\nvisitExpressionBinaryAndOr called ...\n");
        final Set<Object> result = new HashSet<>(super.visit(ctx.leftFormula));
        switch (ctx.operator.getText()) {
        case "&":
            result.retainAll(super.visit(ctx.rightFormula));
            return result;
        case "|":
            result.addAll(super.visit(ctx.rightFormula));
            return result;
        default:
            throw new UnsupportedOperationException();
        }
    }

    @Override
    public Set<Object> visitExpressionBinaryImplies(final ExpressionBinaryImpliesContext ctx) {
        System.out.println("\nvisitExpressionBinaryImplies called ...\n");
        final Set<Object> result = new HashSet<>(states);
        result.removeAll(super.visit(ctx.leftFormula));
        result.addAll(super.visit(ctx.rightFormula));
        return result;
    }

    @Override
    protected Set<Object> aggregateResult(Set<Object> aggregate, Set<Object> nextResult) {
        if (aggregate == null) {
            return nextResult;
        }
        if (nextResult == null) {
            return aggregate;
        }
        Set<Object> clone = new HashSet<>(aggregate);
        clone.addAll(nextResult);
        return clone;
    }

}

I use the println statements to see when the different visit methods are called. If I test the grammar shown above, where the occurrences of

(variable=value_assignment)?

are commented out, the output is as expected:

test evaluation of <true & false => true>
visitExpressionBinaryImplies called ...
visitExpressionBinaryAndOr called ...
visitExpressionBoolean called ...
visitExpressionBoolean called ...
visitExpressionBoolean called ...

test evaluation of <false & true => true>
visitExpressionBinaryImplies called ...
visitExpressionBinaryAndOr called ...
visitExpressionBoolean called ...
visitExpressionBoolean called ...
visitExpressionBoolean called ...

PASSED: testEvaluation("true & false => true", [0, 1, 2], [0, 1, 2])
PASSED: testEvaluation("false & true => true", [0, 1, 2], [0, 1, 2])

But when I uncomment those parts, the precedence changes:

test evaluation of <true & false => true>
visitExpressionBinaryAndOr called ...
visitExpressionBoolean called ...
visitExpressionBinaryImplies called ...
visitExpressionBoolean called ...
visitExpressionBoolean called ...

test evaluation of <false & true => true>
visitExpressionBinaryAndOr called ...
visitExpressionBoolean called ...
visitExpressionBinaryImplies called ...
visitExpressionBoolean called ...
visitExpressionBoolean called ...

PASSED: testEvaluation("true & false => true", [0, 1, 2], [0, 1, 2])
FAILED: testEvaluation("false & true => true", [0, 1, 2], [0, 1, 2])
java.lang.AssertionError: Sets differ: expected [0, 1, 2] but got []

As you can see, the implication is now visited AFTER the conjunction, i.e. it ends up nested inside the conjunction, which is not what I want. The first test case also passes only by accident, since the intended operator precedence is not respected. Can anyone explain why the operator precedence changes when I use the value_assignment rule (all I did was remove the comment markers around it)?
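
The failing result can be reproduced without the parser. The following is only a minimal sketch, not part of the project: it applies the same set operations as the visitor above to the two possible groupings of a & b => c. The class name GroupingDemo and its helper methods are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; mirrors the set logic of TestGrammarVisitor.
public class GroupingDemo {

    // Same universe of states as in the test class above.
    static final Set<Object> STATES = new HashSet<>(Arrays.asList(0, 1, 2));

    // "true" holds in every state, "false" in none.
    static Set<Object> bool(boolean value) {
        return value ? new HashSet<>(STATES) : new HashSet<>();
    }

    // Conjunction: intersection of both operand sets.
    static Set<Object> and(Set<Object> left, Set<Object> right) {
        Set<Object> result = new HashSet<>(left);
        result.retainAll(right);
        return result;
    }

    // Implication: (all states minus left) united with right.
    static Set<Object> implies(Set<Object> left, Set<Object> right) {
        Set<Object> result = new HashSet<>(STATES);
        result.removeAll(left);
        result.addAll(right);
        return result;
    }

    // Intended grouping: (a & b) => c
    static Set<Object> andFirst(boolean a, boolean b, boolean c) {
        return implies(and(bool(a), bool(b)), bool(c));
    }

    // Grouping seen in the second output: a & (b => c)
    static Set<Object> impliesFirst(boolean a, boolean b, boolean c) {
        return and(bool(a), implies(bool(b), bool(c)));
    }

    public static void main(String[] args) {
        // false & true => true
        System.out.println(andFirst(false, true, true));     // all states
        System.out.println(impliesFirst(false, true, true)); // empty set
        // true & false => true
        System.out.println(andFirst(true, false, true));     // all states
        System.out.println(impliesFirst(true, false, true)); // all states
    }
}
```

For false & true => true, the grouping (a & b) => c yields all states, while a & (b => c) yields the empty set, which matches the assertion error above. For true & false => true both groupings yield all states, which is why the first test case passes either way.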

Thank you very much for your help!

Answer 1

After some attempts, I managed to work around the problem as follows:

grammar TestGrammar;

@header {
package some.package;
}

fragment A : ('A'|'a') ;
fragment E : ('E'|'e') ;
fragment F : ('F'|'f') ;
fragment L : ('L'|'l') ;
fragment R : ('R'|'r') ;
fragment S : ('S'|'s') ;
fragment T : ('T'|'t') ;
fragment U : ('U'|'u') ;
BOOL : (T R U E | F A L S E) ;

AND : '&' ;
OR : '|' ;
IMPLIES : '=>' ;

AS : 'als' ;

ID : [a-zA-Z_][a-zA-Z0-9_]+ ;

formula  :
  BOOL #ExpressionBoolean
  | leftFormula=formula operator=(AND | OR) rightFormula=formula #ExpressionBinaryAndOr
  | leftFormula=formula operator=IMPLIES rightFormula=formula #ExpressionBinaryImplies
  | innerFormula=formula AS storageName=ID  #ExpressionAssignment
  | identifier=ID #ExpressionIdentifier
;

So I handle the storage ability as a separate formula alternative. This is not exactly what I wanted (it makes the storage option available on every sub-formula, and I have to decide in the visitor whether storing is desired for a specific sub-formula). But I can live with that workaround.
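
As far as I understand ANTLR 4's handling of left-recursive rules (an assumption, not something confirmed in this thread), an alternative is only treated as a binary operator when it has the shape formula op formula. With a trailing optional element it no longer ends in formula and may be handled like a suffix operator, whose right-hand formula then consumes as much input as it can. In the grammar above the binary alternatives end in formula again.

The visitor then has to decide what storing means. Below is a minimal, parser-independent sketch of that bookkeeping. The class AssignmentSketch, its methods, and the idea of resolving an identifier to a previously stored set are all assumptions made for illustration; in the real visitor this logic would sit in visitExpressionAssignment and visitExpressionIdentifier.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical storage helper; not generated by ANTLR.
public class AssignmentSketch {

    // Maps a storage name to the state set of the formula stored under it.
    private final Map<String, Set<Object>> stored = new HashMap<>();

    // What visitExpressionAssignment could do once ctx.innerFormula
    // has been evaluated: remember the result, then pass it on unchanged.
    public Set<Object> store(String storageName, Set<Object> innerResult) {
        stored.put(storageName,
                Collections.unmodifiableSet(new HashSet<>(innerResult)));
        return innerResult;
    }

    // What visitExpressionIdentifier could do (assumed semantics):
    // return a mutable copy of the set stored under that name.
    public Set<Object> lookup(String identifier) {
        Set<Object> value = stored.get(identifier);
        if (value == null) {
            throw new NoSuchElementException("unknown identifier: " + identifier);
        }
        return new HashSet<>(value);
    }
}
```

Returning a copy from lookup keeps later retainAll or addAll calls in the binary visitor methods from modifying the stored value.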
