Getting tree construction with ANTLR

As asked and answered in Removing Left Recursion in ANTLR, I could remove the left recursion from the following grammar:

E -> E + T|T
T -> T * F|F
F -> INT | ( E )

After left recursion removal, I get the following:

E -> TE'
E' -> null | + TE'
T -> FT'
T' -> null | * FT'
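
For reference, the rewrite applied above is the textbook one for immediate left recursion: A -> A a | b becomes A -> b A', A' -> null | a A'. A minimal sketch of that step in Python follows; the function name and the list-of-symbols representation are assumptions made for illustration, not part of ANTLR.

```python
def remove_immediate_left_recursion(name, alternatives):
    """Rewrite A -> A a | b as A -> b A', A' -> null | a A'.

    Each alternative is a list of symbols; the empty list stands for null.
    """
    prime = name + "'"
    # Tails of the left-recursive alternatives (the "a" parts).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # Alternatives that do not start with the rule itself (the "b" parts).
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}
    return {
        name: [alt + [prime] for alt in others],
        prime: [[]] + [tail + [prime] for tail in recursive],
    }


e_rules = remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
t_rules = remove_immediate_left_recursion("T", [["T", "*", "F"], ["F"]])
```

F is not left recursive, so it stays as F -> INT | ( E ).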

Then, how do I build the tree with the modified grammar? With the input 1+2, I want to get the tree

^('+' ^(INT 1) ^(INT 2))

or something similar.

grammar T;

options {
    output=AST;
    language=Python;
    ASTLabelType=CommonTree;
}

start : e -> e
   ;
e : t ep -> ???
   ;
ep :
   | '+' t ep -> ???
   ;
t : f tp -> ???
  ;
tp :
  | '*' f tp -> ???
  ;
f : INT
  | '(' e ')' -> e
  ;

INT : '0'..'9'+ ;
WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;

A bit of opinion: although it's sometimes possible to go from an LR grammar to an LL grammar, as you have done, the result isn't as idiomatic and may look like a strange way to define the grammar to someone familiar with LL grammars.

For example, consider the following excerpt from above:

tp : 
  | '*' f tp -> ???

The rule above accepts a * followed by an f, whose FIRST set contains INT and (, and then recurses on itself, since it is right recursive. The left operand of the * is matched outside tp, so the rule never sees the start of the expression you want rooted at *, which makes building the tree you want much more difficult than it needs to be.
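
To illustrate the extra work involved, here is a hand-written Python sketch that mirrors the right-recursive rules one to one. It is not ANTLR output; the Parser class, the tokenize helper, and the tuple-based tree are assumptions made for this example. Because ep and tp never match their own left operand, the caller has to pass it down as an argument.

```python
import re


def tokenize(text):
    # INT tokens and single-character operators; whitespace is skipped.
    return re.findall(r"\d+|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def e(self):                # e : t ep
        return self.ep(self.t())

    def ep(self, left):         # ep : | '+' t ep
        if self.peek() == "+":
            self.take()
            # Root the operator over the operand handed down by the caller.
            return self.ep(("+", left, self.t()))
        return left

    def t(self):                # t : f tp
        return self.tp(self.f())

    def tp(self, left):         # tp : | '*' f tp
        if self.peek() == "*":
            self.take()
            return self.tp(("*", left, self.f()))
        return left

    def f(self):                # f : INT | '(' e ')'
        if self.peek() == "(":
            self.take()
            tree = self.e()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.take()
            return tree
        return ("INT", self.take())


tree = Parser(tokenize("1+2")).e()
```

For the input 1+2 this yields ('+', ('INT', '1'), ('INT', '2')), the shape asked for in the question, but only because the left operand is threaded through every call.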

To make it easy to create that AST in ANTLR, you want to have both the operands and the operator in the same rule.

add:
   INT '+'^ INT;

The caret, ^, makes the + the root of the tree, and the two INTs become its children.

The example Bart K linked to is a great example of how I'd expect to see it done with an LL grammar, and it scales to support operators of different precedence.
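
The linked example is not reproduced here. As a rough sketch of the general shape only, the usual LL style gives each precedence level a rule of the form operand (operator operand)*, so the operator and both operands are visible in one place. The Python below imitates that with loops; the parse function and the tuple trees are assumptions for illustration, and the rule shapes in the comments are a guess at how such a grammar would typically be written in ANTLR v3.

```python
import re
from collections import deque


def parse(text):
    tokens = deque(re.findall(r"\d+|[+*()]", text))

    def e():                    # e : t ('+'^ t)*
        tree = t()
        while tokens and tokens[0] == "+":
            tokens.popleft()
            tree = ("+", tree, t())   # operator becomes the new root
        return tree

    def t():                    # t : f ('*'^ f)*
        tree = f()
        while tokens and tokens[0] == "*":
            tokens.popleft()
            tree = ("*", tree, f())
        return tree

    def f():                    # f : INT | '('! e ')'!
        if not tokens:
            raise SyntaxError("unexpected end of input")
        tok = tokens.popleft()
        if tok == "(":
            tree = e()
            if not tokens or tokens.popleft() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if not tok.isdigit():
            raise SyntaxError("unexpected token " + tok)
        return ("INT", tok)

    tree = e()
    if tokens:
        raise SyntaxError("unexpected token " + tokens[0])
    return tree


result = parse("1+2*3")
```

Adding another precedence level means adding one more rule of the same shape between e and f, which is what lets this style scale.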
