
NLTK - generate text from a probabilistic context-free grammar (PCFG)

I have a context-free grammar and use it to generate sentences (using NLTK in Python).

# Create a CFG
from nltk import CFG
from nltk.parse.generate import generate
grammar = CFG.fromstring("""
Story -> Introduction MainQuest End
LocationInfo -> 'He found himself in a small village where he grew up.'
Introduction -> 'Long ago there was a boy who decided to become a knight.'

MainQuest -> LocationInfo 'He had to get a sword first to fight monsters' Navigate

Navigate -> '[He could go west]' GoodEnd | '[He could go east]' BadEnd
GoodEnd -> 'And he lived happily ever after.'
BadEnd -> 'Finally he died painfully.'
End -> 'The End'
""")

#print(grammar.start())
#print(grammar.productions())
for sentence in generate(grammar, n=2):
    print('\n'.join(sentence))
    print('\n')

This is easy and works. But now I'd like to add probabilities to certain rules, so that the generated story gets either a good or a bad ending, chosen at random according to the given probabilities.

I cannot find any example of how to do this, and when I feed my PCFG into nltk.parse.generate, it treats it like a plain CFG.

Hope you can help me out!

nltk.parse.generate.generate does not produce random sentences. It returns an iterator that yields each possible sentence exactly once, stopping when the requested number of sentences has been generated. The maximum derivation depth can be restricted, but the generation is depth-first; it does not order the sentences by derivation depth.
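As a quick check of that behaviour, here is a minimal sketch using a toy PCFG (the grammar and its 0.9/0.1 weights are made up for illustration, not taken from the question). If generate ignores the probabilities, as described above, both runs should list every sentence once, in the same order:

```python
from nltk import PCFG
from nltk.parse.generate import generate

# Hypothetical toy grammar; the weights are illustrative only.
toy = PCFG.fromstring("""
S -> 'start' Ending [1.0]
Ending -> 'good' [0.9] | 'bad' [0.1]
""")

# generate() walks the productions and never looks at their probabilities,
# so two runs enumerate the same sentences in the same order.
first_run = [' '.join(tokens) for tokens in generate(toy)]
second_run = [' '.join(tokens) for tokens in generate(toy)]

print(first_run)
print(first_run == second_run)
```

Passing n=... or depth=... only truncates this enumeration; it does not turn it into sampling.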

You can find the source code in nltk/parse/generate.py in the NLTK repository; it's not difficult to see what it is doing.

So it is entirely deterministic, and never repeats itself. If you want a (potentially infinite) stream of randomly selected sentences, you will have to write your own generator.
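One way such a generator could look is sketched below. This is not part of NLTK: the sample function and the 0.7/0.3 split are assumptions made for illustration. It relies only on PCFG.fromstring, grammar.start(), grammar.productions(lhs=...), and the prob() and rhs() methods of the productions, and expands nonterminals top-down, picking each production with its stated probability:

```python
import random

from nltk import PCFG
from nltk.grammar import Nonterminal


def sample(grammar, symbol=None, rng=random):
    """Return one randomly derived sentence as a list of terminal strings.

    Hypothetical helper, not an NLTK function.
    """
    if symbol is None:
        symbol = grammar.start()
    # Terminals are plain strings; emit them unchanged.
    if not isinstance(symbol, Nonterminal):
        return [symbol]
    # Choose one production for this nonterminal, weighted by probability.
    productions = grammar.productions(lhs=symbol)
    weights = [p.prob() for p in productions]
    chosen = rng.choices(productions, weights=weights, k=1)[0]
    tokens = []
    for child in chosen.rhs():
        tokens.extend(sample(grammar, child, rng))
    return tokens


# The grammar from the question, with a probability on every rule.
# The 0.7 / 0.3 split is an assumed example value.
story_grammar = PCFG.fromstring("""
Story -> Introduction MainQuest End [1.0]
Introduction -> 'Long ago there was a boy who decided to become a knight.' [1.0]
MainQuest -> LocationInfo 'He had to get a sword first to fight monsters' Navigate [1.0]
LocationInfo -> 'He found himself in a small village where he grew up.' [1.0]
Navigate -> '[He could go west]' GoodEnd [0.7] | '[He could go east]' BadEnd [0.3]
GoodEnd -> 'And he lived happily ever after.' [1.0]
BadEnd -> 'Finally he died painfully.' [1.0]
End -> 'The End' [1.0]
""")

for _ in range(3):
    print('\n'.join(sample(story_grammar)))
    print()
```

Unlike generate, this can repeat sentences, and each call is independent of the previous one. For a recursive grammar the recursion depth is not bounded, so a depth limit would be worth adding before using it on anything larger than this story grammar.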
