
How to display NLTK parse tree in an HTML page?

I am creating a sentence-parsing application using Python and Django. I get the parse tree of a sentence as shown below.

>>> import nltk
>>> sentence = """At eight o'clock on Thursday morning
... Arthur didn't feel very good."""

>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']

>>> tagged = nltk.pos_tag(tokens)

>>> tagged[0:6]
[('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
('Thursday', 'NNP'), ('morning', 'NN')]
>>> entities = nltk.chunk.ne_chunk(tagged)
>>> entities
Tree('S', [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'),
       ('on', 'IN'), ('Thursday', 'NNP'), ('morning', 'NN'),
   Tree('PERSON', [('Arthur', 'NNP')]),
       ('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'),
       ('very', 'RB'), ('good', 'JJ'), ('.', '.')])

When I call tree.draw(), the parse tree is shown as an image. I want to show it in a web page. How can I do this?

To display an NLTK parse tree in an HTML page, you can use the Constituent-Treelib library, which builds on benepar, spaCy, and NLTK and parses the sentence into a constituency tree. First, install the library:

pip install constituent-treelib

Then, perform the following steps:

from constituent_treelib import ConstituentTree, BracketedTree

# Define your sentence that should be parsed and visualized
sentence = "At eight o'clock on Thursday morning Arthur didn't feel very good."

# Choose the language used to select the underlying benepar and spaCy models
language = ConstituentTree.Language.English

# You can also specify the desired model for the language ("Small" is selected by default)
spacy_model_size = ConstituentTree.SpacyModelSize.Large

# Create the necessary NLP pipeline (required to instantiate a ConstituentTree object)
nlp = ConstituentTree.create_pipeline(language, spacy_model_size) 

# If you wish, you can instruct the library to download and install the models automatically
# nlp = ConstituentTree.create_pipeline(language, spacy_model_size, download_models=True) 

# Instantiate a ConstituentTree object and pass it the sentence as well as the NLP pipeline
tree = ConstituentTree(sentence, nlp)

# Now you can export the tree into an SVG file (or other formats) and display it in an HTML page 
tree.export_tree("NLTK_parse_tree.svg")

Result:

 <svg baseProfile="full" height="264px" preserveAspectRatio="xMidYMid meet" style="font-family: times, serif; font-weight:normal; font-style: normal; font-size: 16px;" version="1.1" viewBox="0,0,656.0,264.0" width="656px" xmlns="http://www.w3.org/2000/svg" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:xlink="http://www.w3.org/1999/xlink"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">S</text></svg><svg width="24.3902%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">PP</text></svg><svg width="20%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">IN</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">At</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="10%" y1="1.2em" y2="3em" /><svg width="80%" x="20%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">NP</text></svg><svg width="43.75%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">CD</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">eight</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="21.875%" y1="1.2em" y2="3em" /><svg width="56.25%" x="43.75%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">NN</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">o'clock</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="71.875%" y1="1.2em" y2="3em" 
/></svg><line stroke="black" x1="50%" x2="60%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="12.1951%" y1="1.2em" y2="3em" /><svg width="28.0488%" x="24.3902%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">PP</text></svg><svg width="17.3913%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">IN</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">on</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="8.69565%" y1="1.2em" y2="3em" /><svg width="82.6087%" x="17.3913%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">NP</text></svg><svg width="52.6316%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">NNP</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">Thursday</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="26.3158%" y1="1.2em" y2="3em" /><svg width="47.3684%" x="52.6316%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">NN</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">morning</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="76.3158%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="58.6957%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="38.4146%" y1="1.2em" y2="3em" /><svg width="9.7561%" x="52.439%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" 
y="1em">NP</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">NNP</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">Arthur</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="57.3171%" y1="1.2em" y2="3em" /><svg width="34.1463%" x="62.1951%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">VP</text></svg><svg width="17.8571%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">VBD</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">did</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="8.92857%" y1="1.2em" y2="3em" /><svg width="17.8571%" x="17.8571%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">RB</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">n't</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="26.7857%" y1="1.2em" y2="3em" /><svg width="64.2857%" x="35.7143%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">VP</text></svg><svg width="33.3333%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">VB</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">feel</text></svg></svg><line stroke="black" x1="50%" x2="50%" 
y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="16.6667%" y1="1.2em" y2="3em" /><svg width="66.6667%" x="33.3333%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">ADJP</text></svg><svg width="50%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">RB</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">very</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="25%" y1="1.2em" y2="3em" /><svg width="50%" x="50%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">JJ</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">good</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="75%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="66.6667%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="67.8571%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="79.2683%" y1="1.2em" y2="3em" /><svg width="3.65854%" x="96.3415%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">.</text></svg><svg width="100%" x="0%" y="3em"><defs /><svg width="100%" x="0" y="0em"><defs /><text text-anchor="middle" x="50%" y="1em">.</text></svg></svg><line stroke="black" x1="50%" x2="50%" y1="1.2em" y2="3em" /></svg><line stroke="black" x1="50%" x2="98.1707%" y1="1.2em" y2="3em" /></svg>
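
The answer stops at the exported SVG file, so here is a minimal sketch of one way to get that file into a web page: inline the SVG markup into an HTML document. It uses only the Python standard library. The function name build_tree_page and the output file name parse_tree.html are assumptions made for illustration and are not part of Constituent-Treelib or NLTK.

```python
from html import escape
from pathlib import Path


def build_tree_page(svg_markup: str, title: str = "Parse tree") -> str:
    """Return a minimal HTML page with the given SVG markup inlined in the body.

    build_tree_page is a hypothetical helper, not a library function.
    """
    svg_markup = svg_markup.strip()
    # An XML declaration is only allowed at the start of an XML file,
    # so drop it before placing the markup inside an HTML body.
    if svg_markup.startswith("<?xml") and "?>" in svg_markup:
        svg_markup = svg_markup.split("?>", 1)[1].lstrip()
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n{svg_markup}\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # File name taken from the tree.export_tree(...) call in the answer.
    svg_path = Path("NLTK_parse_tree.svg")
    if svg_path.exists():
        page = build_tree_page(svg_path.read_text(encoding="utf-8"))
        # "parse_tree.html" is an arbitrary output name chosen for this sketch.
        Path("parse_tree.html").write_text(page, encoding="utf-8")
```

In a Django project, the same string could be returned from a view in an HttpResponse, or the SVG markup could be passed to a template and rendered with the safe filter. Only inline SVG that your own code generated, since the markup is inserted without escaping.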
