ANTLR literal string matching: what am I doing wrong?

I've been using ANTLR for 3 days. I can parse expressions, write Listeners, interpret parse trees... it's a dream come true.

But then I tried to match the literal string 'foo%' and it fails. I can find plenty of examples that claim to do this, and I have tried them all.

So I created a tiny project to match a literal string. I must be doing something silly.

grammar Test;

clause
  : stringLiteral EOF
  ;

fragment ESCAPED_QUOTE : '\\\'';
stringLiteral :   '\'' ( ESCAPED_QUOTE | ~('\n'|'\r') ) + '\'';

Simple test:

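import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

public class Test {

    @org.junit.Test
    public void test() {
        String input = "'foo%'";
        // TestLexer and TestParser are generated by ANTLR from Test.g4
        TestLexer lexer = new TestLexer(new ANTLRInputStream(input));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        TestParser parser = new TestParser(tokens);
        ParseTree clause = parser.clause();
        System.out.println(clause.toStringTree(parser));

        ParseTreeWalker walker = new ParseTreeWalker();
    }
}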

The result:

Running com.example.Test
line 1:1 token recognition error at: 'f'
line 1:2 token recognition error at: 'o'
line 1:3 token recognition error at: 'o'
line 1:4 token recognition error at: '%'
line 1:6 no viable alternative at input '<EOF>'
(clause (stringLiteral ' ') <EOF>)
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.128 sec - in com.example.Test

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

The full maven-ized build tree is available for a quick review here

31 lines of code... most of it borrowed from small examples.

 $ mvn clean test

Using antlr-4.5.2-1.

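Fragment rules can only be used by other lexer rules, so you need to make stringLiteral a lexer rule instead of a parser rule: just let its name start with an upper-case letter. Inside a parser rule, ~('\n'|'\r') means "any token other than these", not "any character", so the lexer has no rule that matches f, o, o or %, which is where the token recognition errors come from.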

Also, it's better to extend your negated set ~('\n'|'\r') so that it excludes the quote and the backslash as well, and you might want to allow the backslash itself to be escaped:

clause
  : StringLiteral EOF
  ;

StringLiteral :   '\'' ( Escape | ~('\'' | '\\' | '\n' | '\r') ) + '\'';

fragment Escape : '\\' ( '\'' | '\\' );
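
To get a feel for which inputs the StringLiteral rule above accepts, here is a minimal sketch that mirrors it with java.util.regex instead of ANTLR. This is only an approximation for experimenting without generating a lexer; the class name StringLiteralCheck and the method isStringLiteral are made up for this illustration and are not part of ANTLR or of the grammar.

```java
import java.util.regex.Pattern;

// Hypothetical helper: a regex analogue of the StringLiteral lexer rule.
// It does not use ANTLR; it only mimics what the rule is meant to match.
class StringLiteralCheck {

    // Mirrors: fragment Escape : '\\' ( '\'' | '\\' );
    private static final String ESCAPE = "\\\\['\\\\]";

    // Mirrors: ~('\'' | '\\' | '\n' | '\r')
    private static final String PLAIN = "[^'\\\\\\r\\n]";

    // Mirrors: '\'' ( Escape | plain char )+ '\''
    private static final Pattern STRING_LITERAL =
            Pattern.compile("'(?:" + ESCAPE + "|" + PLAIN + ")+'");

    // True if the whole input is one string literal according to the pattern.
    static boolean isStringLiteral(String input) {
        return STRING_LITERAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "'foo%'", "'it\\'s'", "''", "'foo", "'a\nb'" };
        for (String s : samples) {
            System.out.println(isStringLiteral(s) + " <- " + s.replace("\n", "\\n"));
        }
    }
}
```

Note that, because of the + in the rule, an empty literal '' is rejected; changing + to * in the grammar (and in this sketch) would accept it.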
