Regular Expressions - tree grammar Antlr Java
I'm trying to write a program in ANTLR (Java) that simplifies regular expressions. I have already written some code (grammar file contents below):
grammar Regexp_v7;
options{
language = Java;
output = AST;
ASTLabelType = CommonTree;
backtrack = true;
}
tokens{
DOT;
REPEAT;
RANGE;
NULL;
}
fragment
ZERO
: '0'
;
fragment
DIGIT
: '1'..'9'
;
fragment
EPSILON
: '@'
;
fragment
FI
: '%'
;
ID
: EPSILON
| FI
| 'a'..'z'
| 'A'..'Z'
;
NUMBER
: ZERO
| DIGIT (ZERO | DIGIT)*
;
WHITESPACE
: ('\r' | '\n' | ' ' | '\t' ) + {$channel = HIDDEN;}
;
list
: (reg_exp ';'!)*
;
term
: ID -> ID
| '('! reg_exp ')'!
;
repeat_exp
: term ('{' range_exp '}')+ -> ^(REPEAT term (range_exp)+)
| term -> term
;
range_exp
: NUMBER ',' NUMBER -> ^(RANGE NUMBER NUMBER)
| NUMBER (',') -> ^(RANGE NUMBER NULL)
| ',' NUMBER -> ^(RANGE NULL NUMBER)
| NUMBER -> ^(RANGE NUMBER NUMBER)
;
kleene_exp
: repeat_exp ('*'^)*
;
concat_exp
: kleene_exp (kleene_exp)+ -> ^(DOT kleene_exp (kleene_exp)+)
| kleene_exp -> kleene_exp
;
reg_exp
: concat_exp ('|'^ concat_exp)*
;
My next goal is to write a tree grammar that can simplify regular expressions (e.g. a|a -> a). I have done some coding (see below), but I have trouble defining a rule that treats nodes as subtrees, in order to simplify expressions such as (a|a)|(a|a) to a.
tree grammar Regexp_v7Walker;
options{
language = Java;
tokenVocab = Regexp_v7;
ASTLabelType = CommonTree;
output=AST;
backtrack = true;
}
tokens{
NULL;
}
bottomup
: ^('*' ^('*' e=.)) -> ^('*' $e) //a** -> a*
| ^('|' i=.* j=.* {$i.tree.toStringTree() == $j.tree.toStringTree()} )
-> $i // There are 3 errors while this line is up and running:
// 1. CommonTree cannot be resolved,
// 2. i.tree cannot be resolved or is not a field,
// 3. i cannot be resolved.
;
Small driver class:
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;

public class Regexp_Test_v7 {
    public static void main(String[] args) throws RecognitionException {
        // Sample input: several regular expressions separated by ';'
        CharStream stream = new ANTLRStringStream("a***;a|a;(ab)****;ab|ab;ab|aa;");
        Regexp_v7Lexer lexer = new Regexp_v7Lexer(stream);
        CommonTokenStream tokenStream = new CommonTokenStream(lexer);
        Regexp_v7Parser parser = new Regexp_v7Parser(tokenStream);
        // Rule return types are nested classes of the generated parser
        Regexp_v7Parser.list_return list = parser.list();
        CommonTree t = (CommonTree) list.getTree();
        System.out.println("Original tree: " + t.toStringTree());
        // Walk the AST with the tree grammar and apply its rewrite rules
        CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
        Regexp_v7Walker s = new Regexp_v7Walker(nodes);
        t = (CommonTree) s.downup(t);
        System.out.println("Simplified tree: " + t.toStringTree());
    }
}
Can anyone help me with solving this case? Thanks in advance and regards.
Now, I'm no expert, but in your tree grammar: add filter=true to the options, and change the second line of the bottomup rule to:
^('|' i=. j=. {i.toStringTree().equals(j.toStringTree()) }? ) -> $i
If I'm not mistaken, by using i=.* you're allowing i to be non-existent, and you'll get a NullPointerException on conversion to a String.
Both i and j are of type CommonTree because you've set it up this way: ASTLabelType = CommonTree, so you should call i.toStringTree().
And since it's Java and you're comparing Strings, use equals().
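To illustrate the equals() point, here is a minimal standalone sketch with no ANTLR involved. The class name StringCompareDemo and the helper render are made up for illustration; render only mimics the fact that toStringTree() builds a new String on each call. Two strings built this way have the same content but are distinct objects, so == should report false while equals() reports true.

```java
public class StringCompareDemo {
    // Hypothetical stand-in for toStringTree(): builds a fresh String on every call.
    public static String render(String left, String right) {
        return new StringBuilder("(| ")
                .append(left).append(' ').append(right).append(')')
                .toString();
    }

    // Reference comparison: true only if both variables point to the same object.
    public static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters.
    public static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String first = render("a", "b");
        String second = render("a", "b");
        System.out.println("==     : " + sameReference(first, second));
        System.out.println("equals : " + sameContent(first, second));
    }
}
```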
Also, to make the expression in curly brackets a predicate, you need a question mark after the closing bracket.
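As a rough illustration of what the two rewrites are meant to achieve, here is a plain-Java sketch that does not use ANTLR at all. Node is a hypothetical stand-in for CommonTree, and simplify is an invented helper, not something ANTLR generates. It applies a** -> a* and x|x -> x bottom-up, comparing subtrees by their string form with equals(), so a nested case like (a|a)|(a|a) collapses once its children have been simplified.

```java
import java.util.ArrayList;
import java.util.List;

public class SimplifySketch {
    // Hypothetical minimal tree node; a stand-in for ANTLR's CommonTree.
    public static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        public Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) {
                children.add(k);
            }
        }

        // LISP-style rendering, similar in spirit to CommonTree.toStringTree().
        public String toStringTree() {
            if (children.isEmpty()) {
                return text;
            }
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) {
                sb.append(' ').append(c.toStringTree());
            }
            return sb.append(')').toString();
        }
    }

    // Simplify the children first, then the node itself (bottom-up).
    public static Node simplify(Node n) {
        Node[] kids = new Node[n.children.size()];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = simplify(n.children.get(i));
        }
        Node result = new Node(n.text, kids);

        // a** -> a* : a star whose only child is itself a star is redundant.
        if (result.text.equals("*") && kids.length == 1
                && kids[0].text.equals("*") && kids[0].children.size() == 1) {
            return kids[0];
        }
        // x|x -> x : both alternatives are structurally identical subtrees.
        if (result.text.equals("|") && kids.length == 2
                && kids[0].toStringTree().equals(kids[1].toStringTree())) {
            return kids[0];
        }
        return result;
    }

    public static void main(String[] args) {
        Node nested = new Node("|",
                new Node("|", new Node("a"), new Node("a")),
                new Node("|", new Node("a"), new Node("a")));
        Node stars = new Node("*", new Node("*", new Node("*", new Node("a"))));
        System.out.println(nested.toStringTree() + " => " + simplify(nested).toStringTree());
        System.out.println(stars.toStringTree() + " => " + simplify(stars).toStringTree());
    }
}
```

Comparing string forms is simple but rebuilds the strings on every comparison; for large trees a structural equality method on the node type would likely be a better fit.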