简体   繁体   中英

How can I implement simple CFG parser using PHP preg_match_all?

I am using preg_match_all to make a simple parser. Note that since it will parse only few sentences, the performance does not matter. Would it be possible to make a parser which parse through below Context free grammer?

S -> NP VP
PP -> P NP
NP -> 'the' N | N PP | 'the' N PP
VP -> V NP | V PP | V NP PP
N -> 'cat'
N -> 'dog'
N -> 'rug'
V -> 'chased'
V -> 'sat'
P -> 'in'
P -> 'on'

The problem here that I couldn't resolve was loop.

For example, do you see loop where there can be PP -> NP -> PP and so on?

Is there anything in PHP that works like Push-down automata that can solve this problem?

Example input: 'the cat chased the dog'

Example output:

(S (NP the (N cat)) (VP (V chased) (NP the (N dog))))

Example input: 'the cat chased the dog on the rug'

Example output(s):

(S (NP the (N cat)) (VP (V chased) (NP the (N dog) (PP (P on) (NP the (N rug))))))

(S (NP the (N cat)) (VP (V chased) (NP the (N dog)) (PP (P on) (NP the (N rug)))))

The usual approach here is to write a predictive parser. For you, this could mean using regular expressions to match either a noun, verb or predicate and then deciding what production to use. You are correct that parsing a grammar requires the computational power of a push-down automata (ie more than what a regular expression alone can achieve). Simulating a push-down automaton is one approach and is what parser generators like yacc/bison often do. For a small grammar like that though, you can use the call stack implicitly.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM