I'm trying to write a context-free grammar for simple regular expressions. The symbols I want are [0-9], [a-z] and [A-Z]; the operators are "|", "()" and "." for concatenation. For repetition I only want "*" for now; later I will add "+", "?", etc. I tried this grammar in JavaCC:
void RE(): {}
{
    FINAL(0) ( "." FINAL(0) | "|" FINAL(0) )*
}

void FINAL(int sign): { Token t; }
{
    t = <SYMBOL> {
        if ( sign == 1 )
            jjtThis.val = t.image + "*";
        else
            jjtThis.val = t.image;
    }
    | FINAL(1) "*"
    | "(" RE() ")"
}
The problem is in the FINAL production, at the line
| FINAL(1) "*"
which gives me the error: Left recursion detected: "FINAL... --> FINAL...". Putting "*" to the left of FINAL(1) resolves the problem, but that is not what I want.
I already tried reading the Wikipedia article on removing left recursion, but I really don't understand how to do it. Can someone help?
The following takes care of the left recursion:
RE --> FINAL ("." FINAL | "|" FINAL)*
FINAL --> PRIMARY ( "*" )*
PRIMARY --> <SYMBOL> | "(" RE ")"
However, that won't give "." precedence over "|". For that you can do the following:
RE --> TERM ("|" TERM)*
TERM --> FINAL ("." FINAL)*
FINAL --> PRIMARY ( "*" )*
PRIMARY --> <SYMBOL> | "(" RE ")"
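To see the precedence at work, here is a minimal hand-written recursive-descent sketch of the grammar above in plain Java (not JavaCC output). The class name RegexPrecedenceSketch and its helpers are made up for illustration, and <SYMBOL> is assumed to be a single character from [0-9a-zA-Z]. It returns a fully parenthesized string so the grouping is visible.

```java
// Hypothetical sketch: one method per nonterminal of the precedence grammar.
// Assumption: <SYMBOL> is a single character in [0-9a-zA-Z].
final class RegexPrecedenceSketch {
    private final String input;
    private int pos = 0;

    private RegexPrecedenceSketch(String input) { this.input = input; }

    /** Parses the whole input and returns a fully parenthesized form. */
    static String parse(String input) {
        RegexPrecedenceSketch p = new RegexPrecedenceSketch(input);
        String tree = p.re();
        if (p.pos != input.length()) throw p.error("unexpected input");
        return tree;
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at position " + pos);
    }

    private boolean peekIs(char c) {
        return pos < input.length() && input.charAt(pos) == c;
    }

    // RE --> TERM ("|" TERM)*
    private String re() {
        String left = term();
        while (peekIs('|')) {
            pos++;
            String right = term();
            left = "(" + left + "|" + right + ")";
        }
        return left;
    }

    // TERM --> FINAL ("." FINAL)*
    private String term() {
        String left = fin();
        while (peekIs('.')) {
            pos++;
            String right = fin();
            left = "(" + left + "." + right + ")";
        }
        return left;
    }

    // FINAL --> PRIMARY ("*")*   (a loop replaces the left-recursive rule)
    private String fin() {
        String node = primary();
        while (peekIs('*')) {
            pos++;
            node = node + "*";
        }
        return node;
    }

    // PRIMARY --> <SYMBOL> | "(" RE ")"
    private String primary() {
        if (pos >= input.length()) throw error("unexpected end of input");
        char c = input.charAt(pos);
        boolean isSymbol = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        if (isSymbol) {
            pos++;
            return String.valueOf(c);
        }
        if (c == '(') {
            pos++;
            String inner = re();
            if (!peekIs(')')) throw error("expected ')'");
            pos++;
            return inner;
        }
        throw error("expected a symbol or '('");
    }

    public static void main(String[] args) {
        System.out.println(parse("a|b.c"));    // prints (a|(b.c))
        System.out.println(parse("a.b|c"));    // prints ((a.b)|c)
        System.out.println(parse("(a|b)*.c")); // prints ((a|b)*.c)
    }
}
```

Because TERM sits below RE, every "." chain is grouped before "|" gets to combine anything, which is what gives "." the higher precedence.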
The general rule is
A --> A b | c | d | ...
can be transformed to
A --> B b*
B --> c | d | ...
where B is a new nonterminal.
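The rule can also be applied mechanically. The following is a rough sketch in Java, assuming each alternative is written as a string of space-separated symbols; the class LeftRecursionSketch and the method eliminate are hypothetical names, and only immediate left recursion (an alternative that starts with A itself) is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: rewrites  A --> A b1 | A b2 | c | d
// into  A --> B (b1 | b2)*  and  B --> c | d.
final class LeftRecursionSketch {
    static List<String> eliminate(String a, String b, List<String> alternatives) {
        List<String> tails = new ArrayList<>(); // what follows A in left-recursive alternatives
        List<String> bases = new ArrayList<>(); // alternatives that do not start with A
        String prefix = a + " ";
        for (String alt : alternatives) {
            String trimmed = alt.trim();
            if (trimmed.startsWith(prefix)) {
                tails.add(trimmed.substring(prefix.length()).trim());
            } else {
                bases.add(trimmed);
            }
        }
        if (tails.isEmpty()) {
            // No immediate left recursion: leave the rule unchanged.
            return Arrays.asList(a + " --> " + String.join(" | ", bases));
        }
        if (bases.isEmpty()) {
            throw new IllegalArgumentException(a + " has no non-recursive alternative");
        }
        return Arrays.asList(
            a + " --> " + b + " (" + String.join(" | ", tails) + ")*",
            b + " --> " + String.join(" | ", bases));
    }

    public static void main(String[] args) {
        // The FINAL rule from the question, with PRIMARY as the new nonterminal.
        List<String> rules = eliminate("FINAL", "PRIMARY",
            Arrays.asList("<SYMBOL>", "FINAL \"*\"", "\"(\" RE \")\""));
        for (String rule : rules) {
            System.out.println(rule);
        }
        // prints:
        // FINAL --> PRIMARY ("*")*
        // PRIMARY --> <SYMBOL> | "(" RE ")"
    }
}
```

Applied to the FINAL rule from the question, this reproduces the FINAL and PRIMARY rules given above.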