简体   繁体   中英

Saving nltk drawn parse tree to image file

在此处输入图像描述

Is there any way to save the draw image from tree.draw() to an image file programmatically? I tried looking through the documentation, but I couldn't find anything.

Using the nltk.draw.tree.TreeView object to create the canvas frame automatically:

>>> from nltk.tree import Tree
>>> from nltk.draw.tree import TreeView
>>> t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
>>> TreeView(t)._cframe.print_to_file('output.ps')

Then:

>>> import os
>>> os.system('convert output.ps output.png')

[output.png]:

在此输入图像描述

I had exactly the same need, and looking into the source code of nltk.draw.tree I found a solution:

from nltk import Tree
from nltk.draw.util import CanvasFrame
from nltk.draw import TreeWidget

cf = CanvasFrame()
t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
tc = TreeWidget(cf.canvas(),t)
cf.add_widget(tc,10,10) # (10,10) offsets
cf.print_to_file('tree.ps')
cf.destroy()

The output file is a postscript, and you can convert it to an image file using ImageMagick on terminal:

$ convert tree.ps tree.png

I think this is a quick and dirty solution; it could be inefficient in that it displays the canvas and destroys it later (perhaps there is an option to disable display, which I couldn't find). Please let me know if there is any better way.

To add to Minjoon's answer, you can change the fonts and colours of the tree to look more like the NLTK .draw() version as follows:

tc['node_font'] = 'arial 14 bold'
tc['leaf_font'] = 'arial 14'
tc['node_color'] = '#005990'
tc['leaf_color'] = '#3F8F57'
tc['line_color'] = '#175252'

Before (left) and after (right):

之前 后

To save a given NLTK tree to an image file (OS-agnostic), I recommend the Constituent-Treelib library, which builds on benepar, spaCy and NLTK. First, install it via pip install constituent-treelib

Then, perform the following steps:

from nltk import Tree
from constituent_treelib import ConstituentTree

# Define your sentence that should be parsed and saved to a file
sentence = "At least nine tenths of the students passed."

# Rather than a raw string you can also provide an already constructed NLTK tree
sentence = Tree('S', [Tree('NP', [Tree('NP', [Tree('QP', [Tree('ADVP', [Tree('RB', ['At']), Tree('RBS', ['least'])]), Tree('CD', ['nine'])]), Tree('NNS', ['tenths'])]), Tree('PP', [Tree('IN', ['of']), Tree('NP', [Tree('DT', ['the']), Tree('NNS', ['students'])])])]), Tree('VP', [Tree('VBD', ['passed'])]), Tree('.', ['.'])])

# Define the language that should be considered with respect to the underlying benepar and spaCy models 
language = ConstituentTree.Language.English

# You can also specify the desired model for the language ("Small" is selected by default)
spacy_model_size = ConstituentTree.SpacyModelSize.Large

# Create the neccesary NLP pipeline (required to instantiate a ConstituentTree object)
nlp = ConstituentTree.create_pipeline(language, spacy_model_size) 

# In case you haven't downloaded the required benepar an spaCy models, you can tell the method to do it automatically for you
# nlp = ConstituentTree.create_pipeline(language, spacy_model_size, download_models=True) 

# Instantiate a ConstituentTree object and pass it the sentence as well as the NLP pipeline
tree = ConstituentTree(sentence, nlp)

# Now you can export the tree to a file (e.g., a PDF)  
tree.export_tree("NLTK_parse_tree.pdf", verbose=True)

>>> PDF-file successfully saved to: NLTK_parse_tree.pdf

Result... 在此处输入图像描述

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM