ANTLR: beginner's mismatched input expecting ID

As a beginner learning ANTLR4 from The Definitive ANTLR 4 Reference, I tried to run my modified version of the exercise from Chapter 7:

/**
 * to parse properties file
 * this example demonstrates using embedded actions in code
 */
grammar PropFile;

@header  {
    import java.util.Properties;
}
@members {
    Properties props = new Properties();
}

file
    : { System.out.println("Loading file..."); }
      prop+
    ;

prop
    : ID '=' STRING NEWLINE
      {
        props.setProperty($ID.getText(),$STRING.getText());//add one property
      }
    ;

ID  : [a-zA-Z]+ ;
STRING  :(~[\r\n])+; //if use  STRING : '"' .*? '"'  everything is fine
NEWLINE :   '\r'?'\n' ;

Since Java properties are just key-value pairs, I use STRING to match everything except NEWLINE (I don't want it to support only strings in double quotes). When I run the test rig on my input, I get:

D:\Antlr\Ex\PropFile\Prop1>grun PropFile prop -tokens
line 1:0 mismatched input 'driver=mysql' expecting ID

When I use STRING : '"' .*? '"' instead, it works.

I would like to know where I went wrong so that I can avoid similar mistakes in the future.

Any suggestions are welcome, thank you!

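Both ID and STRING can match input text starting with "driver", and the lexer always chooses the rule that produces the longest match. ID can only match "driver" (6 characters), while STRING matches the whole line "driver=mysql" (12 characters), so STRING wins even though the ID rule comes first. Rule order only breaks ties between matches of equal length. The parser therefore receives a single STRING token where the prop rule expects ID, which is exactly what the error message reports.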
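The longest-match rule can be illustrated outside ANTLR. The following is a minimal sketch that imitates the rule with java.util.regex; it is not how the ANTLR lexer is implemented, and the class name LongestMatchDemo and the method firstToken are made up for this illustration. The three patterns mirror ID, STRING and NEWLINE from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchDemo {
    // Rules in grammar order (hypothetical regex equivalents of the lexer rules).
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ID", Pattern.compile("[a-zA-Z]+"));
        RULES.put("STRING", Pattern.compile("[^\\r\\n]+"));
        RULES.put("NEWLINE", Pattern.compile("\\r?\\n"));
    }

    /** Returns "RULE:text" for the token starting at pos, or null if no rule matches. */
    static String firstToken(String input, int pos) {
        String bestRule = null;
        int bestEnd = pos;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            m.region(pos, input.length());
            // A strictly longer match wins; on a tie the earlier rule is kept.
            if (m.lookingAt() && m.end() > bestEnd) {
                bestRule = rule.getKey();
                bestEnd = m.end();
            }
        }
        return bestRule == null ? null : bestRule + ":" + input.substring(pos, bestEnd);
    }

    public static void main(String[] args) {
        System.out.println(firstToken("driver=mysql\n", 0));
        System.out.println(firstToken("driver\n", 0));
    }
}
```

For the input "driver=mysql" the STRING pattern produces the longer match, so it is selected. For "driver" alone, both patterns match the same six characters and the earlier rule, ID, is kept.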

So, you have several choices here. The most direct is to remove the ambiguity between ID and STRING (which is how your alternative works) by requiring the string to start with the equals sign.

file : prop+ EOF ;
prop : ID STRING NEWLINE ;

ID      : [a-zA-Z]+ ;
STRING  : '=' (~[\r\n])+;
NEWLINE : '\r'?'\n' ;

You can then use an action to trim the equals sign from the text of the string token.
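As a rough sketch of that trimming step, the helper below removes the leading equals sign before the value is stored. The class PropValue and its methods are hypothetical names introduced for this example. In the grammar, the action on the prop rule would pass $ID.getText() and $STRING.getText() to something like it.

```java
import java.util.Properties;

class PropValue {
    /** Strips one leading '=' from the STRING token text, e.g. "=mysql" becomes "mysql". */
    static String stripEquals(String tokenText) {
        return tokenText.startsWith("=") ? tokenText.substring(1) : tokenText;
    }

    /** Stores the key with the trimmed value, as the embedded action would. */
    static void addProperty(Properties props, String idText, String stringText) {
        props.setProperty(idText, stripEquals(stringText));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        addProperty(props, "driver", "=mysql");
        System.out.println(props.getProperty("driver"));
    }
}
```

Only the first equals sign is removed, so a value that itself contains "=" is kept intact.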

Alternatively, you can use a predicate to disambiguate the rules.

file : prop+ EOF ;
prop : ID '=' STRING NEWLINE ;

STRING  : { isValue() }? (~[\r\n])+; // listed before ID so it wins equal-length ties when the predicate holds
ID      : [a-zA-Z]+ ;
NEWLINE : '\r'?'\n' ;

where the isValue method looks backwards on the character stream to verify that it follows an equals sign. Something like:

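@lexer::members {
// Scoped to the lexer: _tokenStartCharIndex and the character stream _input are lexer members.
public boolean isValue() {
    int offset = _tokenStartCharIndex;
    // Walk backwards from the token start, skipping whitespace.
    for (int idx = offset-1; idx >=0; idx--) {
        String s = _input.getText(Interval.of(idx, idx));
        if (Character.isWhitespace(s.charAt(0))) {
            continue;
        } else if (s.charAt(0) == '=') {
            return true;   // nearest non-whitespace character is '='
        } else {
            break;         // some other character precedes the token
        }
    }
    return false;
}
}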
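The look-back logic can be tried out on a plain string, without the ANTLR runtime. This is a minimal sketch under that assumption: the class LookBackDemo is hypothetical, and the tokenStart parameter stands in for the lexer's _tokenStartCharIndex.

```java
class LookBackDemo {
    /** True if the nearest non-whitespace character before tokenStart is '='. */
    static boolean isValue(CharSequence text, int tokenStart) {
        for (int idx = tokenStart - 1; idx >= 0; idx--) {
            char c = text.charAt(idx);
            if (Character.isWhitespace(c)) {
                continue; // skip whitespace between '=' and the value
            }
            return c == '=';
        }
        return false; // reached the start of the input without finding '='
    }

    public static void main(String[] args) {
        String line = "driver=mysql";
        System.out.println(isValue(line, 0)); // token starting at "driver"
        System.out.println(isValue(line, 7)); // token starting at "mysql"
    }
}
```

At offset 0 there is nothing to look back at, so the predicate fails and only ID can match the key. At offset 7 the preceding character is the equals sign, so the predicate holds and STRING becomes eligible.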
