How to print the parse tree in Yacc (Bison)
I wrote a parser for the C- language using Bison and Flex. It works: it prints "syntax error" to the terminal if the given C- input is syntactically wrong, and prints nothing otherwise.
But I want my parser to print the parse tree for the given C- input as its output. How do I do that? Is there a function in Bison that can be used to print the parse tree?
The TXR language (http://www.nongnu.org/txr) uses Flex and Yacc for parsing its input. You can see the parse tree if you give it the -v option.
For example:
$ ./txr -v -c "@/[a-z]*|foo/"
spec:
(((text (#<sys:regex: 9d99268> or (0+ (set (#\a . #\z))) (compound #\f #\o #\o)))))
You construct the tree in the parser actions and print it yourself with a tree-printing routine. I used a Lisp-like object representation to make life easier. Writing this out is handled by a recursive printing function which recognizes all the possible object types and renders them into notation. For instance, above you see objects of character type printed with a hash-backslash notation, and the unprintable, opaque, compiled regex is printed using the notation #< ... >.
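To make the idea of a recursive printing routine concrete, here is a minimal sketch in C. It is not TXR's code: the node type, the names mk_sym, mk_chr, cons and print_node, and the choice of NULL for the empty list are all assumptions made for illustration. It renders characters in the hash-backslash notation and improper pairs with a dot, as in the output shown above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical Lisp-like node; NULL stands for nil (the empty list). */
typedef enum { NODE_SYM, NODE_CHR, NODE_CONS } node_kind;

typedef struct node {
    node_kind kind;
    const char *sym;          /* used when kind == NODE_SYM  */
    char chr;                 /* used when kind == NODE_CHR  */
    struct node *car, *cdr;   /* used when kind == NODE_CONS */
} node;

node *mk_node(node_kind kind)
{
    node *n = calloc(1, sizeof *n);
    if (n == NULL)
        abort();
    n->kind = kind;
    return n;
}

node *mk_sym(const char *name) { node *n = mk_node(NODE_SYM); n->sym = name; return n; }
node *mk_chr(char c)           { node *n = mk_node(NODE_CHR); n->chr = c; return n; }

node *cons(node *car, node *cdr)
{
    node *n = mk_node(NODE_CONS);
    n->car = car;
    n->cdr = cdr;
    return n;
}

/* Append s to the NUL-terminated string in buf, truncating if it is full. */
void emit(char *buf, size_t cap, const char *s)
{
    assert(buf != NULL && cap > 0);
    size_t len = strlen(buf);
    if (len + 1 < cap)
        strncat(buf, s, cap - len - 1);
}

/* Recursively render a node into buf; buf must start out as "". */
void print_node(char *buf, size_t cap, const node *n)
{
    if (n == NULL) {
        emit(buf, cap, "nil");
        return;
    }
    switch (n->kind) {
    case NODE_SYM:
        emit(buf, cap, n->sym);
        break;
    case NODE_CHR: {
        char tmp[4] = { '#', '\\', n->chr, '\0' };
        emit(buf, cap, tmp);
        break;
    }
    case NODE_CONS:
        emit(buf, cap, "(");
        for (;;) {
            print_node(buf, cap, n->car);
            if (n->cdr == NULL)
                break;                      /* proper end of list */
            if (n->cdr->kind != NODE_CONS) {
                emit(buf, cap, " . ");      /* dotted pair */
                print_node(buf, cap, n->cdr);
                break;
            }
            emit(buf, cap, " ");
            n = n->cdr;
        }
        emit(buf, cap, ")");
        break;
    }
}
```

With this sketch, a parser's top-level action would store the finished tree in a global, and main would call print_node on it after yyparse returns successfully. The sketch never frees its nodes, which a real parser would have to address.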
Here is part of the grammar:
regexpr : regbranch                     { $$ = if3(cdr($1),
                                                   cons(compound_s, $1),
                                                   car($1)); }
        | regexpr '|' regexpr           { $$ = list(or_s, $1, $3, nao); }
        | regexpr '&' regexpr           { $$ = list(and_s, $1, $3, nao); }
        | '~' regexpr                   { $$ = list(compl_s, $2, nao); }
        | /* empty */ %prec LOW         { $$ = nil; }
        ;
As you can see, constructing the AST is largely just simple construction of nested lists. This form is very convenient to compile. The top-level function of the NFA-based regex compiler is very readable:
/*
 * Input is the items from a regex form,
 * not including the regex symbol.
 * I.e. (rest '(regex ...)) not '(regex ...).
 */
static nfa_t nfa_compile_regex(val exp)
{
  if (nullp(exp)) {
    nfa_state_t *acc = nfa_state_accept();
    nfa_state_t *s = nfa_state_empty(acc, 0);
    return nfa_make(s, acc);
  } else if (typeof(exp) == chr_s) {
    nfa_state_t *acc = nfa_state_accept();
    nfa_state_t *s = nfa_state_single(acc, c_chr(exp));
    return nfa_make(s, acc);
  } else if (exp == wild_s) {
    nfa_state_t *acc = nfa_state_accept();
    nfa_state_t *s = nfa_state_wild(acc);
    return nfa_make(s, acc);
  } else {
    val sym = first(exp), args = rest(exp);
    if (sym == set_s) {
      return nfa_compile_set(args, nil);
    } else if (sym == cset_s) {
      return nfa_compile_set(args, t);
    } else if (sym == compound_s) {
      return nfa_compile_list(args);
    } else if (sym == zeroplus_s) {
      nfa_t nfa_arg = nfa_compile_regex(first(args));
      nfa_state_t *acc = nfa_state_accept();
      /* New start state has empty transitions going through
         the inner NFA, or skipping it right to the new acceptance state. */
      nfa_state_t *s = nfa_state_empty(nfa_arg.start, acc);
      /* Convert acceptance state of inner NFA to one which has
         an empty transition back to the start state, and
         an empty transition to the new acceptance state. */
      nfa_state_empty_convert(nfa_arg.accept, nfa_arg.start, acc);
      return nfa_make(s, acc);
    } else if (sym == oneplus_s) {
      /* One-plus case differs from zero-plus in that the new start state
         does not have an empty transition to the acceptance state.
         So the inner NFA must be traversed once. */
      nfa_t nfa_arg = nfa_compile_regex(first(args));
      nfa_state_t *acc = nfa_state_accept();
      nfa_state_t *s = nfa_state_empty(nfa_arg.start, 0); /* <-- diff */
      nfa_state_empty_convert(nfa_arg.accept, nfa_arg.start, acc);
      return nfa_make(s, acc);
    } else if (sym == optional_s) {
      /* In this case, we can keep the acceptance state of the inner
         NFA as the acceptance state of the new NFA. We simply add
         a new start state which can short-circuit to it via an empty
         transition. */
      nfa_t nfa_arg = nfa_compile_regex(first(args));
      nfa_state_t *s = nfa_state_empty(nfa_arg.start, nfa_arg.accept);
      return nfa_make(s, nfa_arg.accept);
    } else if (sym == or_s) {
      /* Simple: make a new start and acceptance state, which form
         the ends of a spindle that goes through two branches. */
      nfa_t nfa_first = nfa_compile_regex(first(args));
      nfa_t nfa_second = nfa_compile_regex(second(args));
      nfa_state_t *acc = nfa_state_accept();
      /* New state s has empty transitions into each inner NFA. */
      nfa_state_t *s = nfa_state_empty(nfa_first.start, nfa_second.start);
      /* Acceptance state of each inner NFA converted to empty
         transition to new combined acceptance state. */
      nfa_state_empty_convert(nfa_first.accept, acc, 0);
      nfa_state_empty_convert(nfa_second.accept, acc, 0);
      return nfa_make(s, acc);
    } else {
      internal_error("bad operator in regex");
    }
  }
}
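The grammar actions shown earlier build each node with calls such as list(or_s, $1, $3, nao), where the last argument appears to mark the end of the argument list. Below is a small sketch of how a variadic list constructor with an end sentinel could be written in C. It assumes that reading of nao and is not TXR's implementation: the cell type and the names atom, cons, list, length and END are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cell: an atom if 'atom' is non-NULL, otherwise a cons pair. */
typedef struct cell {
    const char *atom;
    struct cell *car, *cdr;
} cell;

/* End-of-arguments sentinel. It is a distinct object rather than NULL,
   so that nil (NULL) can itself be passed as a list element. */
static cell end_marker;
#define END (&end_marker)

cell *atom(const char *name)
{
    assert(name != NULL);
    cell *c = calloc(1, sizeof *c);
    if (c == NULL)
        abort();
    c->atom = name;
    return c;
}

cell *cons(cell *car, cell *cdr)
{
    cell *c = calloc(1, sizeof *c);
    if (c == NULL)
        abort();
    c->car = car;
    c->cdr = cdr;
    return c;
}

/* Build a proper list from the arguments up to, but not including, END.
   A nil element must be passed as (cell *)NULL so it has pointer type. */
cell *list(cell *first, ...)
{
    va_list ap;
    cell *head = NULL, **tail = &head;

    va_start(ap, first);
    for (cell *item = first; item != END; item = va_arg(ap, cell *)) {
        *tail = cons(item, NULL);   /* append a new pair at the end */
        tail = &(*tail)->cdr;
    }
    va_end(ap);
    return head;
}

/* Number of elements in a proper list. */
size_t length(const cell *l)
{
    size_t n = 0;
    for (; l != NULL; l = l->cdr)
        n++;
    return n;
}
```

In a Bison action this would be used as, for example, $$ = list(atom("or"), $1, $3, END); with the semantic value type declared as cell *. Forgetting the sentinel is undefined behavior, which is the main cost of this design compared with fixed-arity constructors.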