
How to convert an NLTK Tree to a string in Python?

for subtree3 in tree.subtrees():
  if subtree3.label() == 'CLAUSE':
    print(subtree3)
    print(subtree3.leaves())

Using this code I am able to extract the leaves of the tree, which are [('talking', 'VBG'), ('constantly', 'RB')] for a certain example. That is perfectly correct. Now I want to convert these tree elements into a string or a list for further processing. How can I do that?

What I tried

for subtree3 in tree.subtrees():
  if subtree3.label() == 'CLAUSE':
    print(subtree3)
    print(subtree3.leaves())
    fo.write(subtree3.leaves())
fo.close()

But it throws an error:

Traceback (most recent call last):
  File "C:\Python27\Association_verb_adverb.py", line 35, in <module>
    fo.write(subtree3.leaves())
TypeError: expected a character buffer object

I just want to store the leaves in a text file.

It depends on your version of NLTK and Python. I assume you are referring to the Tree class in the nltk.tree module. If so, read on.

In your code:

  1. subtree3.leaves() returns a list of tuples, and
  2. fo is a Python file object, and fo.write accepts only a str as its parameter.

So you can simply write the tree leaves with fo.write(str(subtree3.leaves())), thus:

for subtree3 in tree.subtrees():
    if subtree3.label() == 'CLAUSE':
        print(subtree3)
        print(subtree3.leaves())
        # write() needs a string, so convert the list of tuples first
        fo.write(str(subtree3.leaves()))
fo.flush()
fo.close()

and don't forget to close() the file when you are done. close() flushes the buffer itself, so the explicit flush() is optional.
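If the goal is a file that is easy to read back later, one option is to format each leaf yourself instead of writing the list's repr. The following is a minimal sketch, assuming Python 3 and that each leaf is a (word, tag) pair as in the question. The helper name leaves_to_line and the file name clauses.txt are made up for illustration, and the hard-coded list stands in for subtree3.leaves().

```python
import os
import tempfile

def leaves_to_line(leaves):
    """Format (word, tag) pairs as 'word/TAG word/TAG ...' (hypothetical helper)."""
    return ' '.join('{}/{}'.format(word, tag) for word, tag in leaves)

# Stand-in for subtree3.leaves(); with NLTK this list would come from the tree.
leaves = [('talking', 'VBG'), ('constantly', 'RB')]

# Write to a temporary directory so the sketch does not clobber real files.
path = os.path.join(tempfile.mkdtemp(), 'clauses.txt')

# The with block closes (and therefore flushes) the file automatically.
with open(path, 'w') as fo:
    fo.write(str(leaves) + '\n')             # the list's repr, as in the answer
    fo.write(leaves_to_line(leaves) + '\n')  # one clause per line, word/TAG

# Read the file back to check what was stored.
with open(path) as fo:
    written = fo.read().splitlines()
```

The first line of the file holds the plain str() form and the second the word/TAG form. Which one is preferable depends on what the later processing expects.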

The question may be less about traversing the NLTK Tree object and more about writing a list of tuples to a file. See "NLTK: How do I traverse a noun phrase to return list of strings?" and "Unpacking a list / tuple of pairs into two lists / tuples".

To output a list of tuples of 2 strings, I find it useful to use this idiom:

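listoftuples = [('talking', 'VBG'), ('constantly', 'RB')]
# Unzip the pairs into ('talking', 'constantly') and ('VBG', 'RB')
words, tags = zip(*listoftuples)

# The with block closes the file even if write() fails
with open('outputfile', 'w') as fout:
    fout.write(' '.join(words) + '\t' + ' '.join(tags) + '\n')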

But the zip(*list) code might not work if there are multiple levels in your subtrees.
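One way to make the unpacking more forgiving is sketched below. It assumes Python 3. The helper name split_leaves is hypothetical, and treating a bare string leaf as a word without a tag is an assumption of this sketch, not NLTK behaviour. It also covers the empty list, where words, tags = zip(*[]) raises a ValueError because there is nothing to unpack.

```python
def split_leaves(leaves):
    """Split leaves into parallel word and tag lists (hypothetical helper).

    Leaves that are (word, tag) tuples are unpacked; anything else is kept
    as a word with an empty tag. That fallback is an assumption of this sketch.
    """
    words, tags = [], []
    for leaf in leaves:
        if isinstance(leaf, tuple) and len(leaf) == 2:
            word, tag = leaf
        else:
            word, tag = str(leaf), ''
        words.append(word)
        tags.append(tag)
    return words, tags

# The example from the question: two tagged leaves.
words, tags = split_leaves([('talking', 'VBG'), ('constantly', 'RB')])
line = ' '.join(words) + '\t' + ' '.join(tags)

# An empty list yields two empty lists instead of raising.
empty_words, empty_tags = split_leaves([])

# A leaf without a tag is kept as a word with an empty tag.
mixed_words, mixed_tags = split_leaves([('talking', 'VBG'), 'constantly'])
```

The explicit loop is longer than the zip idiom, but it keeps words and tags aligned even when some leaves carry no tag.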
