
ANTLR: No viable alternative error '{"type"'

I know there are a lot of questions like this, and we've been going through them all, but we can't seem to find a solution that fits our needs.

We wrote a simple grammar (a lexer and a parser) for a JavaScript-to-Java converter, but we can't get it to consume the first token of our input file correctly.

Here's the grammar:

lexer grammar JS2JAVALexer;

STRING
   : '"' (ESC | ~ ["\\])* '"'
   ;
fragment ESC
   : '\\' (["\\/bfnrt] | UNICODE)
   ;
fragment UNICODE
   : 'u' HEX HEX HEX HEX
   ;
fragment HEX
   : [0-9a-fA-F]
   ;
NUMBER
   : '-'? INT '.' [0-9] + EXP? | '-'? INT EXP | '-'? INT
   ;
fragment INT
   : '0' | [1-9] [0-9]*
   ;
// no leading zeros
fragment EXP
   : [Ee] [+\-]? INT
   ;
// \- since - means "range" inside [...]
WS
   : [ \t\n\r] + -> skip
   ;

OPENPAR : '(' ;
CLOSEPAR : ')' ;
OPENBRACES : '{' ;
CLOSEBRACES : '}' ;
OPENBRACKETS : '[' ;
CLOSEBRACKETS : ']' ;
TWOPOINTS : ':' ;
QUOTATION_MARK : '"';
COMMA : ',' ;
TRUE : 'true';

FALSE : 'false';
NULL : 'null' ;

TYPE : 'type';
SOURCETYPE : '"sourceType"';
BODY : '"body"';

After running it, we get the error "line 2:4 no viable alternative at input '{"type"'".

This is our input file:

{
    "type": "Program",
    "body": [
        {
            "type": "FunctionDeclaration",
            "id": {
                "type": "Identifier",
                "name": "name"
            },
            "params": [
                {
                    "type": "Identifier",
                    "name": "arg1"
                },
                {
                    "type": "Identifier",
                    "name": "arg2"
                }
            ],
            "defaults": [],
            "body": {
                "type": "BlockStatement",
                "body": [
                    {
                        "type": "VariableDeclaration",
                        "declarations": [
                            {
                                "type": "VariableDeclarator",
                                "id": {
                                    "type": "Identifier",
                                    "name": "x"
                                },
                                "init": {
                                    "type": "Literal",
                                    "value": 1,
                                    "raw": "1"
                                }
                            }
                        ],
                        "kind": "var"
                    },
                    {
                        "type": "VariableDeclaration",
                        "declarations": [
                            {
                                "type": "VariableDeclarator",
                                "id": {
                                    "type": "Identifier",
                                    "name": "y"
                                },
                                "init": {
                                    "type": "Literal",
                                    "value": 2,
                                    "raw": "2"
                                }
                            }
                        ],
                        "kind": "var"
                    },
                    {
                        "type": "ReturnStatement",
                        "argument": {
                            "type": "BinaryExpression",
                            "operator": "+",
                            "left": {
                                "type": "Identifier",
                                "name": "x"
                            },
                            "right": {
                                "type": "Identifier",
                                "name": "y"
                            }
                        }
                    }
                ]
            },
            "generator": false,
            "expression": false
        }
    ],
    "sourceType": "script"
}

Parser's code:

parser grammar JS2JAVAParser;

options {
    tokenVocab = JS2JAVALexer;
}

json
   : object
   | array
   ;

object
   : OPENBRACES pair (COMMA pair)* CLOSEBRACES
   | OPENBRACES CLOSEBRACES
   ;

left_operand 
    : QUOTATION_MARK left_name QUOTATION_MARK
    ;

left_name 
    : TYPE 
    | BODY 
    | SOURCETYPE
    | DECLARATIONS
    | ID
    | INIT
    | OPERATOR
    | LEFT
    | RIGHT
    | VALUE
    | RAW
    | KIND
    ;

pair
   : left_operand TWOPOINTS value
   ;

array
   : OPENBRACKETS value (COMMA value)* CLOSEBRACKETS
   | OPENBRACKETS CLOSEBRACKETS
   ;

value
   : QUOTATION_MARK value_name QUOTATION_MARK
   | object
   | array
   | TRUE
   | FALSE
   | NULL
   | STRING
   | IDENTIFIER
   | LITERAL
   | VAR
   | STRING
   ;

value_name 
    : SCRIPT
    | PROGRAM
    ;

Sorry for such a long and repetitive question, but we're running out of ideas. Thank you in advance for your patience, guys.

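The problem is that the lexer recognizes "type" as a single STRING token, so the parser never sees the QUOTATION_MARK TYPE QUOTATION_MARK sequence that left_operand expects. An ANTLR lexer takes the longest match at each position and, when two rules match the same length, prefers the rule defined first. STRING matches the whole of "type", which is longer than the single quote matched by QUOTATION_MARK, and it is defined before BODY and SOURCETYPE, so it wins those ties as well. In effect, everything that could be a STRING or something else is recognized as a STRING. You need to resolve that lexer ambiguity, either by rewriting the lexer rules or, potentially, by using lexer modes.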
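
One possible way to remove the ambiguity, as a rough sketch rather than a definitive fix, is to stop competing with STRING altogether: let the lexer emit STRING for anything in double quotes and have the parser accept any STRING as a key. The version below reuses the rule and token names from the question. It assumes that QUOTATION_MARK, TYPE, SOURCETYPE and BODY are simply no longer referenced (they could then be deleted from the lexer), and it adds NUMBER to value, which is my addition because the input contains pairs such as "value": 1.

```antlr
parser grammar JS2JAVAParser;

options {
    tokenVocab = JS2JAVALexer;
}

json
   : object
   | array
   ;

object
   : OPENBRACES pair (COMMA pair)* CLOSEBRACES
   | OPENBRACES CLOSEBRACES
   ;

// Any quoted key ("type", "body", "sourceType", ...) arrives as one STRING token,
// so the key is matched as STRING instead of QUOTATION_MARK name QUOTATION_MARK.
pair
   : STRING TWOPOINTS value
   ;

array
   : OPENBRACKETS value (COMMA value)* CLOSEBRACKETS
   | OPENBRACKETS CLOSEBRACKETS
   ;

// Quoted values ("Program", "script", ...) are STRING tokens as well.
// NUMBER is added here (assumption) to cover unquoted values such as 1 and 2.
value
   : STRING
   | NUMBER
   | object
   | array
   | TRUE
   | FALSE
   | NULL
   ;
```

With this shape the grammar no longer knows which key is "type" or "body". That distinction would have to be made after parsing, for example by inspecting the text of the STRING token (quotes included) in a listener or visitor.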

Also, for reference, there is a JSON grammar in the antlr/grammars-v4 repository on GitHub.
