ctx in ANTLR4 javascript visitor

Using ANTLR4 v4.8

I am writing a transpiler and exploring ANTLR for it (JavaScript target, with a visitor).

The grammar lexes and parses fine, and I now have a parse tree.

Grammar

grammar Mygrammar;

/*
 * parser rules
 */

progm   : stmt+;

stmt
: progdecl
| print
;

progdecl : PROGDECLKW ID '..';
print    : WRITEKW STRLIT '..';

/*
 * lexer rules
 */

PROGDECLKW  : 'DECLAREPROGRAM';
WRITEKW     : 'PRINT';

// Literal
STRLIT             : '\'' .*? '\'' ;

// Identifier 
ID              : [a-zA-Z0-9]+;

// skip
LINE_COMMENT    : '*' .*? '\n' -> skip;
TERMINATOR      : [\r\n]+ -> skip;
WS              : [ \t\n\r]+ -> skip;

hw.mg

***************
* Hello world
***************

DECLAREPROGRAM  hw..

PRINT 'Hello World!'..

index.js

...
const myVisitor = require('./src/myVisitor').myVisitor;

const input = './src_sample/hw.mg';
const chars = new antlr4.FileStream(input);
...
parser.buildParseTrees = true;

const myVisit = new myVisitor();
myVisit.visitPrint(parser.print());

Using the visitor didn't seem straightforward, though this SO post helps to an extent.

On the use of context: is there a good way to track ctx as I hit each node?
Using myVisit.visit(tree) with the root context works fine. But when I start visiting individual nodes with a non-root context, as in
myVisit.visitPrint(parser.print()), I get an error.

Error:

PrintContext {
  parentCtx: null,
  invokingState: -1,
  ruleIndex: 3,
  children: null,
  start: CommonToken {
    source: [ [MygrammarLexer], [FileStream] ],
    type: -1,
    channel: 0,
    start: 217,

together with the exception InputMismatchException [Error].
I believe this is because children is null instead of being populated,
which in turn is due to: line 9:0 mismatched input '<EOF>' expecting {'DECLAREPROGRAM', 'PRINT'}

Question:
Is the above the only way to pass the context, or am I doing this wrong? If the usage is correct, I am inclined to report this as a bug.

edit 17.3 - added grammar and source

When you invoke parser.print() but feed it the input:

***************
* Hello world
***************

DECLAREPROGRAM  hw..

PRINT 'Hello World!'..

it will not work. For print(), the parser expects input like PRINT 'Hello World!'.. only. To parse the entire input, you will have to invoke progm() instead. It is also wise to "anchor" your start rule with the EOF token, which forces ANTLR to consume the entire input:

progm : stmt+ EOF;
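
To make the start-rule point concrete, here is a minimal sketch in plain JavaScript. It does not use the ANTLR runtime or the generated MygrammarParser: MiniParser, its match helper, and the token objects are hypothetical stand-ins that only mimic how each rule method consumes tokens from the current position. The token list is assumed to be what the lexer leaves for hw.mg once comments and whitespace are skipped.

```javascript
// Hand-written stand-in for the generated parser; NOT the ANTLR runtime.
// Assumed token stream for hw.mg after comments and whitespace are skipped.
const tokens = [
  { type: 'PROGDECLKW', text: 'DECLAREPROGRAM' },
  { type: 'ID', text: 'hw' },
  { type: '..', text: '..' },
  { type: 'WRITEKW', text: 'PRINT' },
  { type: 'STRLIT', text: "'Hello World!'" },
  { type: '..', text: '..' },
  { type: 'EOF', text: '<EOF>' },
];

class MiniParser {
  constructor(tokenList) {
    this.tokens = tokenList;
    this.pos = 0;
  }

  // Consume the current token if it has the expected type, otherwise fail.
  match(type) {
    const tok = this.tokens[this.pos];
    if (tok.type !== type) {
      throw new Error(`mismatched input '${tok.text}' expecting ${type}`);
    }
    this.pos += 1;
    return tok;
  }

  // progdecl : PROGDECLKW ID '..';
  progdecl() {
    return { rule: 'progdecl', children: [this.match('PROGDECLKW'), this.match('ID'), this.match('..')] };
  }

  // print : WRITEKW STRLIT '..';
  print() {
    return { rule: 'print', children: [this.match('WRITEKW'), this.match('STRLIT'), this.match('..')] };
  }

  // stmt : progdecl | print;
  stmt() {
    const tok = this.tokens[this.pos];
    if (tok.type === 'PROGDECLKW') return { rule: 'stmt', children: [this.progdecl()] };
    if (tok.type === 'WRITEKW') return { rule: 'stmt', children: [this.print()] };
    throw new Error(`mismatched input '${tok.text}' expecting {'DECLAREPROGRAM', 'PRINT'}`);
  }

  // progm : stmt+ EOF;
  progm() {
    const children = [];
    do {
      children.push(this.stmt());
    } while (['PROGDECLKW', 'WRITEKW'].includes(this.tokens[this.pos].type));
    children.push(this.match('EOF')); // the EOF anchor: everything must be consumed
    return { rule: 'progm', children };
  }
}

// Starting at print on the whole input fails on the very first token.
let printError = null;
try {
  new MiniParser(tokens).print();
} catch (e) {
  printError = e.message;
}

// Starting at progm consumes both statements and then EOF.
const tree = new MiniParser(tokens).progm();
console.log(printError);
console.log(tree.children.map((c) => c.rule || c.type));
```

The print node is still reachable, but only as a descendant of the tree returned by the start rule, which is where a visitor or listener should begin.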

If you want to parse and visit an entire parse tree (using progm()), but are only interested in the print node/context, then it is better to use a listener instead of a visitor. Check this page for how to use a listener: https://github.com/antlr/antlr4/blob/master/doc/javascript-target.md

EDIT

Here's how a listener works (a Python demo since I don't have the JS set up properly):

import antlr4

from playground.MygrammarLexer import MygrammarLexer
from playground.MygrammarParser import MygrammarParser
from playground.MygrammarListener import MygrammarListener


class PrintPreprocessor(MygrammarListener):
    def enterPrint_(self, ctx: MygrammarParser.Print_Context):
        print("Entered print: `{}`".format(ctx.getText()))


if __name__ == '__main__':
    source = """
        ***************
        * Hello world
        ***************

        DECLAREPROGRAM  hw..

        PRINT 'Hello World!'..
    """
    lexer = MygrammarLexer(antlr4.InputStream(source))
    parser = MygrammarParser(antlr4.CommonTokenStream(lexer))
    antlr4.ParseTreeWalker().walk(PrintPreprocessor(), parser.progm())

When running the code above, the following will be printed:

Entered print: `PRINT'Hello World!'..`

So, in short: this listener accepts the entire parse tree of your input, but only "listens" when we enter the print parser rule.

Note that I renamed print to print_ because print conflicts with a reserved name in the Python target.
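
For readers on the JavaScript side, the same listener idea can be sketched without the ANTLR runtime. Everything below is a hypothetical stand-in: the node helper, the node shape ({ rule, text, children, parentCtx }), and the walk function are assumptions made for illustration, not the API of the generated MygrammarListener or of the runtime's tree walker. The point it shows is that the walker starts at the root and hands each callback the ctx of the node being entered, so there is no need to call a rule method such as parser.print() to obtain one.

```javascript
// Plain-JavaScript stand-in for a parse tree and a tree walker; NOT the ANTLR runtime.
// The node shape ({ rule, text, children, parentCtx }) is hypothetical.
function node(rule, text, children = []) {
  const ctx = { rule, text, children, parentCtx: null };
  for (const child of children) child.parentCtx = ctx;
  return ctx;
}

// Hand-built tree for hw.mg; text mimics a getText()-style result with whitespace skipped.
const tree = node('progm', "DECLAREPROGRAMhw..PRINT'Hello World!'..", [
  node('stmt', 'DECLAREPROGRAMhw..', [node('progdecl', 'DECLAREPROGRAMhw..')]),
  node('stmt', "PRINT'Hello World!'..", [node('print', "PRINT'Hello World!'..")]),
]);

// Depth-first walk: call enter<Rule> before the children and exit<Rule> after them,
// but only when the listener defines that callback.
function walk(listener, ctx) {
  const name = ctx.rule[0].toUpperCase() + ctx.rule.slice(1);
  if (typeof listener['enter' + name] === 'function') listener['enter' + name](ctx);
  for (const child of ctx.children) walk(listener, child);
  if (typeof listener['exit' + name] === 'function') listener['exit' + name](ctx);
}

// The listener only cares about print nodes; every other node is walked silently.
const seen = [];
const printListener = {
  enterPrint(ctx) {
    seen.push(`Entered print: ${ctx.text} (parent rule: ${ctx.parentCtx.rule})`);
  },
};

walk(printListener, tree);
console.log(seen.join('\n'));
```

With the real JavaScript target the equivalent would be a subclass of the generated listener passed to the runtime's tree walker together with the tree returned by parser.progm(); the page linked above describes that setup.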
