I'm trying to remove the indirect recursion from the following grammar:
F -> P X
X -> R C
R -> C R | I R | epsilon
All capital letters are non-terminals, I have just left off the productions as they aren't important.
Following this through, you can see that I will get something that looks like F -> P (C | I)* C which is causing issues for my recursive descent parser.
Any valid expression must end in C but this final C is always consumed by repeatedly using the R production leaving no tokens left for final C of the X -> RC production.
Take the following expression of tokens as an example: "PCIC"
Now we hit the problem. A recursive decent parser will simply choose to continue expanding the R like so:
At this point, my program will error despite the valid input because the final RC -> C (using the epsilon production) and there is no remaining C to match!
I think all I need to do is cleverly rearrange the grammar in some way to remove this ambiguity. Would appreciate any help!
You can rewrite P (C | I) * C
as PI * C + (I + C + ) *
, which is straightforward to implement as a recursive-descent parser. If you write it out as a right-recursive BNF grammar, you'll end up with incorrect associativity, but normally you would write the recursive descent parser with loops, using the extended BNF as above. [Note 1]
In effect, once you parse the P
, you end up with something like this:
do {
while (try_parse(I)) {}
parse(C);
while (try_parse(C)) {}
} while (try_parse(I));
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.