How to get line number in ANTLR3 tree-parser @init action
In ANTLR version 3, how can the line number be obtained in the @init action of a high-level tree-parser rule?
For example, in the @init action below, I'd like to push the line number along with the sentence text.
sentence
@init { m_nodeVisitor.pushScriptContext( new MyScriptContext( $sentence.text )); }
: assignCommand
| actionCommand;
finally {
m_nodeVisitor.popScriptContext();
}
I need to push the context before the execution of the actions associated with symbols in the rules.
Some things that don't work:
$sentence.line: it's not defined, even though $sentence.text is.
getTreeNodeStream().getTreeAdaptor().getToken( $sentence.start ).getLine(). EDIT: Actually, this does work, if $sentence.start is either a real token or an imaginary token with a reference; see Bart Kiers' answer below.
It seems like if I can easily get, in the @init rule, the matched text and the first matched token, there should be an easy way to get the line number as well.
You can look 1 step ahead in the token/tree-stream of a tree grammar using the following: CommonTree ahead = (CommonTree)input.LT(1), which you can place in the @init section.
Every CommonTree (the default Tree implementation in ANTLR) has a getToken() method which returns the Token associated with this tree. And each Token has a getLine() method, which, not surprisingly, returns the line number of this token.
So, if you do the following:
sentence
@init {
CommonTree ahead = (CommonTree)input.LT(1);
int line = ahead.getToken().getLine();
System.out.println("line=" + line);
}
: assignCommand
| actionCommand
;
you should be able to see some correct line numbers being printed. I say "some" because this won't go as planned in all cases. Let me demonstrate using a simple example grammar:
grammar ASTDemo;
options {
output=AST;
}
tokens {
ROOT;
ACTION;
}
parse
: sentence+ EOF -> ^(ROOT sentence+)
;
sentence
: assignCommand
| actionCommand
;
assignCommand
: ID ASSIGN NUMBER -> ^(ASSIGN ID NUMBER)
;
actionCommand
: action ID -> ^(ACTION action ID)
;
action
: START
| STOP
;
ASSIGN : '=';
START : 'start';
STOP : 'stop';
ID : ('a'..'z' | 'A'..'Z')+;
NUMBER : '0'..'9'+;
SPACE : (' ' | '\t' | '\r' | '\n')+ {skip();};
whose tree grammar looks like:
tree grammar ASTDemoWalker;
options {
output=AST;
tokenVocab=ASTDemo;
ASTLabelType=CommonTree;
}
walk
: ^(ROOT sentence+)
;
sentence
@init {
CommonTree ahead = (CommonTree)input.LT(1);
int line = ahead.getToken().getLine();
System.out.println("line=" + line);
}
: assignCommand
| actionCommand
;
assignCommand
: ^(ASSIGN ID NUMBER)
;
actionCommand
: ^(ACTION action ID)
;
action
: START
| STOP
;
And if you run the following test class:
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
public class Main {
public static void main(String[] args) throws Exception {
String src = "\n\n\nABC = 123\n\nstart ABC";
ASTDemoLexer lexer = new ASTDemoLexer(new ANTLRStringStream(src));
ASTDemoParser parser = new ASTDemoParser(new CommonTokenStream(lexer));
CommonTree root = (CommonTree)parser.parse().getTree();
ASTDemoWalker walker = new ASTDemoWalker(new CommonTreeNodeStream(root));
walker.walk();
}
}
you will see the following being printed:
line=4
line=0
As you can see, "ABC = 123" produced the expected output (line 4), but "start ABC" didn't (line 0). This is because the root of the tree created by the actionCommand rule is an ACTION token, and this token is never defined in the lexer, only in the tokens{...} block. Because it doesn't really exist in the input, line 0 is attached to it by default. If you want to change the line number, you need to provide a "reference" token as a parameter to this so-called imaginary ACTION token, which it uses to copy attributes into itself.
So, if you change the actionCommand rule in the combined grammar into:
actionCommand
: ref=action ID -> ^(ACTION[$ref.start] action ID)
;
the line number would be as expected (line 6).
Note that every parser rule has a start and stop attribute, which reference the first and last token matched by the rule, respectively. If action were a lexer rule (say FOO), you could omit the .start:
actionCommand
: ref=FOO ID -> ^(ACTION[$ref] FOO ID)
;
Now the ACTION token has copied all attributes from whatever $ref is pointing to, except the type of the token, which is of course int ACTION. But this also means that it copied the text attribute, so in my example, the AST created by ref=action ID -> ^(ACTION[$ref.start] action ID) could look like:
          [text=START,type=ACTION]
               /          \
              /            \
             /              \
[text=START,type=START]  [text=ABC,type=ID]
Of course, it's a proper AST because the types of the nodes are unique, but it makes debugging confusing since ACTION and START share the same .text attribute.
You can copy all attributes to an imaginary token except the .text and .type by providing a second string parameter, like this:
actionCommand
: ref=action ID -> ^(ACTION[$ref.start, "Action"] action ID)
;
And if you now run the same test class again, you will see the following printed:
line=4
line=6
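The attribute copying behind ACTION, ACTION[$ref.start] and ACTION[$ref.start, "Action"] can be modelled in a few lines of plain Java. This is only a sketch of the behaviour described above, not ANTLR's own code: Tok and the imaginary(...) helpers are hypothetical stand-ins that track just the type, text and line of a token.

```java
public class ImaginaryTokenSketch {
    /** Hypothetical stand-in for a token: only the attributes discussed here. */
    static final class Tok {
        final String type;
        final String text;
        final int line;

        Tok(String type, String text, int line) {
            this.type = type;
            this.text = text;
            this.line = line;
        }

        @Override
        public String toString() {
            return "[type=" + type + ", text='" + text + "', line=" + line + "]";
        }
    }

    /** Models a bare ACTION: no reference token, so the line stays 0. */
    static Tok imaginary(String type) {
        return new Tok(type, type, 0);
    }

    /** Models ACTION[$ref]: everything is copied from ref except the type. */
    static Tok imaginary(String type, Tok ref) {
        return new Tok(type, ref.text, ref.line);
    }

    /** Models ACTION[$ref, "Action"]: copied from ref, but type and text are replaced. */
    static Tok imaginary(String type, Tok ref, String text) {
        return new Tok(type, text, ref.line);
    }

    public static void main(String[] args) {
        // The real 'start' token from "start ABC" on line 6 of the test input.
        Tok start = new Tok("START", "start", 6);
        System.out.println(imaginary("ACTION"));
        System.out.println(imaginary("ACTION", start));
        System.out.println(imaginary("ACTION", start, "Action"));
    }
}
```

Only the second and third forms carry line 6 over to the ACTION node, and only the third keeps its text distinct from that of the START token.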
And if you inspect the tree that is generated, it'll look like this:
[type=ROOT, text='ROOT']
  [type=ASSIGN, text='=']
    [type=ID, text='ABC']
    [type=NUMBER, text='123']
  [type=ACTION, text='Action']
    [type=START, text='start']
    [type=ID, text='ABC']
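If changing the parser grammar is not an option, one possible fallback is to search downward from an imaginary node for the first node that does carry a line number. The sketch below is not ANTLR code: Node and firstRealLine are hypothetical stand-ins, written in plain Java so the idea can be run in isolation. In a real tree walker the same loop would presumably be written against CommonTree, using getToken(), getChildCount() and getChild(i).

```java
import java.util.ArrayList;
import java.util.List;

public class FirstRealLine {
    /** Hypothetical stand-in for a tree node; line 0 means "imaginary, no reference token". */
    static final class Node {
        final String text;
        final int line;
        final List<Node> children = new ArrayList<>();

        Node(String text, int line, Node... kids) {
            this.text = text;
            this.line = line;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Returns the line of the first node, in depth-first order, whose line is non-zero; 0 if none. */
    static int firstRealLine(Node node) {
        if (node.line > 0) {
            return node.line;
        }
        for (Node child : node.children) {
            int line = firstRealLine(child);
            if (line > 0) {
                return line;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // ^(ACTION start ABC), where ACTION was created without a reference token.
        Node action = new Node("ACTION", 0, new Node("start", 6), new Node("ABC", 6));
        System.out.println("line=" + firstRealLine(action)); // prints line=6
    }
}
```

This keeps the grammar untouched, at the cost of reporting the line of a descendant rather than of the node itself, so passing a reference token as shown above is still the more direct fix.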