How to get a parse tree using Python NLTK?

Question

Given the following sentence:

The old oak tree from India fell down.

How can I get the following parse tree representation of the sentence using Python NLTK?

(ROOT (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) (PP (IN from) (NP (NNP India)))) (VP (VBD fell) (PRT (RP down)))))

I need a complete example, which I couldn't find on the web.

Edit

I have gone through this book chapter to learn about parsing with NLTK, but the problem is that I need a grammar to parse sentences or phrases, and I do not have one. I found this Stack Overflow post, which also asks about a grammar for parsing, but there is no convincing answer there.

So I am looking for a complete answer that gives me the parse tree for a given sentence.

Answer 1

Here is an alternative solution using Stanford CoreNLP instead of NLTK. There are a few libraries built on top of Stanford CoreNLP; I personally use pycorenlp to parse the sentence.

First, download the stanford-corenlp-full folder, which contains the *.jar files. Then run the server from inside that folder (the default port is 9000).

export CLASSPATH="`find . -name '*.jar'`"
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000  # run the server (-port is optional; 9000 is the default)

Then, in Python, run the following to parse the sentence.

from pycorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('http://localhost:9000')

text = "The old oak tree from India fell down."

output = nlp.annotate(text, properties={
  'annotators': 'parse',
  'outputFormat': 'json'
})

print(output['sentences'][0]['parse'])  # bracketed parse tree of the first sentence
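
If you want to keep working in NLTK afterwards, the bracketed string returned by the server can be loaded into an nltk.Tree. The following is a minimal sketch, assuming nltk is installed. To stay runnable without a server, it uses the parse string copied from the question rather than a live response, and the variable names are only illustrative.

```python
from nltk.tree import Tree

# Bracketed parse copied from the question. With a running server this
# would be output['sentences'][0]['parse'] instead.
parse_str = (
    "(ROOT (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) "
    "(PP (IN from) (NP (NNP India)))) (VP (VBD fell) (PRT (RP down)))))"
)

# Build an nltk.Tree object from the bracketed string.
tree = Tree.fromstring(parse_str)

print(tree.label())    # label of the root node
print(tree.leaves())   # the words, left to right
print(tree.pos())      # (word, tag) pairs read off the preterminal nodes
tree.pretty_print()    # ASCII rendering of the tree in the terminal
```

Once the parse is a Tree, the usual NLTK helpers such as subtrees() and draw() are available on it.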

Answer 2

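To get a parse tree using only the nltk library, you can use the following code. Note that RegexpParser is a chunker, so the result is a shallow tree rather than a full constituency parse.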

# Import required libraries
import nltk
from nltk import pos_tag, word_tokenize, RegexpParser

# Download the tokenizer and tagger models (needed once).
# Newer NLTK releases may ask for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng' instead.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

# Example text (the sentence from the question)
sample_text = "The old oak tree from India fell down."

# Tokenize the sentence and tag each token with its part of speech
tagged = pos_tag(word_tokenize(sample_text))

# Chunk grammar: each rule groups a pattern of tags under a label
chunker = RegexpParser("""
                    NP: {<DT>?<JJ>*<NN>} #To extract Noun Phrases
                    P: {<IN>}            #To extract Prepositions
                    V: {<V.*>}           #To extract Verbs
                    PP: {<P> <NP>}       #To extract Prepositional Phrases
                    VP: {<V> <NP|PP>*}   #To extract Verb Phrases
                    """)

# Apply the chunker to the tagged tokens and print the resulting tree
output = chunker.parse(tagged)
print("After Extracting\n", output)

The output looks something like this:

(S
  (NP The/DT old/JJ oak/NN)
  (NP tree/NN)
  (P from/IN)
  India/NNP
  (VP (V fell/VBD))
  down/RB
  ./.)

You can also draw this tree in a separate window:

# To draw the parse tree
output.draw()
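
The chunker above produces only a shallow tree. If a hand-written grammar is acceptable, NLTK's CFG and ChartParser can produce the nested tree from the question. The sketch below is a toy example: the grammar is an assumption written for this one sentence, not a general English grammar, so any other sentence would need more rules. It skips the tokenizer and the final period to avoid any model downloads.

```python
from nltk import CFG
from nltk.parse import ChartParser

# Toy grammar (hypothetical, covers only the example sentence).
toy_grammar = CFG.fromstring("""
    S   -> NP VP
    NP  -> NP PP | DT JJ NN NN | NNP
    PP  -> IN NP
    VP  -> VBD PRT
    PRT -> RP
    DT  -> 'The'
    JJ  -> 'old'
    NN  -> 'oak' | 'tree'
    NNP -> 'India'
    IN  -> 'from'
    VBD -> 'fell'
    RP  -> 'down'
""")

parser = ChartParser(toy_grammar)

# Plain whitespace tokenization; every token must appear in the grammar.
tokens = "The old oak tree from India fell down".split()

# parse() yields every tree the grammar licenses for these tokens.
trees = list(parser.parse(tokens))

# Collapse the pretty-printed tree onto a single line.
flat = " ".join(str(trees[0]).split())
print(flat)
```

Writing such a grammar by hand does not scale beyond small examples, which is why the other answers rely on pretrained statistical parsers.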

Answer 3

This is an older question, but you can use nltk together with bllipparser. Here is a longer example from nltk. After some fiddling, I ended up with the following:

To install (with nltk already installed):

sudo python3 -m nltk.downloader bllip_wsj_no_aux
pip3 install bllipparser

To use:

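from nltk.data import find
from bllipparser import RerankingParser

# Locate the model downloaded with nltk.downloader and load the parser
model_dir = find('models/bllip_wsj_no_aux').path
parser = RerankingParser.from_unified_model_dir(model_dir)

# Parse the sentence; the result is an n-best list of candidate parses
best = parser.parse("The old oak tree from India fell down.")

print(best.get_reranker_best())  # top parse according to the reranker
print(best.get_parser_best())    # top parse according to the parser alone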

Output:

-80.435259246021 -23.831876011253 (S1 (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) (PP (IN from) (NP (NNP India)))) (VP (VBD fell) (PRT (RP down))) (. .)))
-79.703612178593 -24.505514522222 (S1 (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) (PP (IN from) (NP (NNP India)))) (VP (VBD fell) (ADVP (RB down))) (. .)))

Answer 4

A comprehensive answer to the OP's question can be found in this thread: https://stackoverflow.com/a/75122588/4293361

solution1
6 ACCEPTED 2017-02-19 03:55:01

solution2
1 2022-03-16 06:12:34

solution3
0 2019-08-29 09:34:35

solution4
0 2023-01-16 10:05:35
