
LR(0)/SLR/LR(1) parsing - how is a production chosen?

I'm trying to wrap my head around parser theory, and I keep finding the same example in different sources. The grammar goes approximately like this (simplified):

E = T
E = E + T
T = 0..9

So supposedly the string 2 + 2 will be parsed as follows ("|" separates the stack from the remainder of the input):

|2 + 2 <-can't reduce, shift
2|+ 2  <-reduce by T = 0..9
T|+ 2  <-reduce by E = T
E|+ 2  <-can't reduce, shift
E +|2  <-can't reduce, shift
E + 2| <-reduce by T = 0..9
E + T| <-reduction by E = E + T here?
E|     <-done

The question is, at the E + T step the parser can apply two different reductions to the rightmost part of the stack: E = T (resulting in E + E) and E = E + T (resulting in E). And I can't find a clear and concise explanation of how it chooses one over the other.

What am I missing?

What are the possible states?

0: Beginning
1: Just shifted 0..9 after State 0, recognize a T
2: Reduce State 1 to an E.
3: Just shifted + after State 2 or 5, looking for T
4: Just shifted 0..9 after State 3, recognize a T giving us E + T.
5: Reduce state 4 to an E
6: Reach the end of the input after state 2 or 5.

So we start in state 0. Shift a 2. We are now in state 1. Reduce to state 2. Shift a +. We are now in state 3. Shift a 2. We are now in state 4. Reduce to state 5. We reach the end of the input (state 6) and wind up with an expression tree looking like the following:

  E
  |
E + T
|   |
T   2
|
2

According to the grammar, an E can never follow a +: the only production containing + is E = E + T, and there the + is followed by a T. This rules out the E = T production in this state.
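One way to see this concretely is to compute the LR(0) item sets and look at which completed productions each state offers. The following is a minimal Python sketch, not a full parser generator. It assumes an extra start rule S = E and uses "d" as a stand-in for any digit token; the helper names (closure, goto, state_after, completed) are made up for this illustration.

```python
# Grammar from the question, with an added start rule S = E (an assumption
# made here so the automaton has a single starting item).
# "d" stands for any digit token 0..9.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("d",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add an item X = . rhs for every nonterminal X that follows a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it."""
    moved = {
        (prod, dot + 1)
        for prod, dot in state
        if dot < len(GRAMMAR[prod][1]) and GRAMMAR[prod][1][dot] == symbol
    }
    return closure(moved)


def state_after(symbols):
    """State on top of the stack once `symbols` have been pushed."""
    state = closure({(0, 0)})
    for symbol in symbols:
        state = goto(state, symbol)
    return state


def completed(state):
    """Productions whose dot is at the end, i.e. the possible reductions."""
    out = []
    for prod, dot in sorted(state):
        lhs, rhs = GRAMMAR[prod]
        if dot == len(rhs):
            out.append(f"{lhs} = {' '.join(rhs)}")
    return out


print(completed(state_after(["T"])))            # ['E = T']
print(completed(state_after(["E", "+", "T"])))  # ['E = E + T']
```

The T at the bottom of the stack and the T after E + lead to two different states, and each state offers exactly one reduction. The parser never compares right-hand sides against the raw stack contents; it only consults the state on top.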

To fully understand that, construct the parser tables by hand - the example is small enough to make this feasible.
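As a rough sketch of what such hand-built tables might look like, here is a small table-driven shift-reduce parser in Python. The tables were derived by hand for this grammar under SLR assumptions. The state numbers, the "digit" and "$" token names, and the parse function are choices made for this illustration and do not correspond to the state numbering used above.

```python
# Hand-built tables for E = T | E + T, T = 0..9.
PRODUCTIONS = {
    "E=T": ("E", 1),      # (left-hand side, number of states popped)
    "E=E+T": ("E", 3),
    "T=digit": ("T", 1),
}

ACTION = {
    (0, "digit"): ("shift", 3),
    (1, "+"): ("shift", 4),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", "E=T"),
    (2, "$"): ("reduce", "E=T"),
    (3, "+"): ("reduce", "T=digit"),
    (3, "$"): ("reduce", "T=digit"),
    (4, "digit"): ("shift", 3),
    (5, "+"): ("reduce", "E=E+T"),
    (5, "$"): ("reduce", "E=E+T"),
}

# Where to go after a reduction, given the state left on top of the stack.
GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "T"): 5}


def parse(text):
    """Return the list of reductions performed, in order."""
    tokens = ["digit" if ch.isdigit() else ch for ch in text if not ch.isspace()]
    tokens.append("$")  # end-of-input marker
    states = [0]        # the stack holds states, not just grammar symbols
    reductions = []
    pos = 0
    while True:
        entry = ACTION.get((states[-1], tokens[pos]))
        if entry is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {states[-1]}")
        kind, arg = entry
        if kind == "shift":
            states.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del states[-length:]                    # pop the handle
            states.append(GOTO[(states[-1], lhs)])  # exposed state picks the next one
            reductions.append(arg)
        else:  # accept
            return reductions


print(parse("2 + 2"))  # ['T=digit', 'E=T', 'T=digit', 'E=E+T']
```

The first T is reduced in state 2, whose only action is E=T. The second T lands in state 5 because GOTO is looked up from state 4 (just after the +), and state 5 only knows E=E+T. The choice between the two reductions is therefore already encoded in the tables.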
