简体   繁体   中英

NLTK CFG recursion depth error

import nltk   
from nltk.parse.generate import generate,demo_grammar   
from nltk import CFG   
grammar = CFG.fromstring("""   
ROOT -> S
S -> NP VP
NP -> NP PP
NP -> DT NN
DT -> 'The'
NN -> 'work'
PP -> IN NP
IN -> 'of'
NP -> DT NN
DT -> 'the'
NN -> 'painter'
VP -> VBZ ADJP
VBZ -> 'is'
ADJP -> JJ
JJ -> 'good'
""")    
print(grammar)   
for sentence in generate(grammar, n=100):   
   print(' '.join(sentence))

Gives an error

RuntimeError: maximum recursion depth exceeded while calling a Python object

Tried changing covert function in functools.py, still the same issue.

The function generate , as its docstring states, "Generates an iterator of all sentences from a CFG." Clearly it does so by choosing alternative expansions in the order they are listed in the grammar. So, the first time is sees an NP , it expands it with the rule NP -> NP PP . It now has another NP to expand, which it also expands with the same rule... and so on ad infinitum, or rather until python's limits are exceeded.

To fix the problem with the grammar you provide, simply reorder your first two NP rules so that the recursive rule is not the first one encountered:

grammar = CFG.fromstring("""   
ROOT -> S
S -> NP VP
NP -> DT NN
NP -> NP PP
DT -> 'The'
...
""")

Do it like this and the generator will produce lots of complete sentences for you to examine. Note that the corrected grammar is still recursive, hence infinite; if you generate a large enough number of sentences, you will eventually reach the same recursion depth limit.

I tried to number the repeating occurrence of NP NN DT etc. It seems to solve the problem due to unique identification( I presume). What wonders me is it should have been like this at first place ie the tree production thrown out should have serialized the parts of speech.

import nltk   
from nltk.parse.generate import generate,demo_grammar   
from nltk import CFG   
grammar = CFG.fromstring("""
ROOT -> S   
S -> NP VP
NP -> NP1 PP
NP1 -> DT1 NN1
DT1 -> 'The'
NN1 -> 'work'
PP -> IN NP2
IN -> 'of'
NP2 -> DT2 NN2
DT2 -> 'the'
NN2 -> 'painter'
VP -> VBZ ADJP
VBZ -> 'is'
ADJP -> JJ
JJ -> 'good'
""")    
print(grammar)   
for sentence in generate(grammar, n=100):   
   print(' '.join(sentence))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM