How can I save the Open NLP parser output from Java, so that I can use it in Python?

I need to use parse trees produced by OpenNLP for some machine learning tasks in Python. OpenNLP is written in Java, and I'm not sure how to save its output so that I can load it in Python as lists or as a tree.

I think you'll have to call the show(StringBuffer) method on the Parse object, which appends the tree to the buffer as a bracketed string, and then write that string to a file with something like a FileWriter in Java. From there you can read the file in Python.

Something like this should do it (untested):

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import opennlp.tools.parser.Parse;

/**
 * Writes the bracketed string form of an OpenNLP parse tree to a file.
 *
 * @author mgiaconia
 */
public class ParseWriter {

  public static void main(String[] args) {
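    // The output file path is expected as the first command-line argument
    if (args.length < 1) {
      System.err.println("Usage: ParseWriter <output-file>");
      return;
    }
    String filePath = args[0];

    try (FileWriter outputFileWriter = new FileWriter(new File(filePath))) {
      // This string is taken from the Parse unit tests in the OpenNLP source code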
      Parse p1 = Parse.parseParse("(TOP  (S-CLF (NP-SBJ (PRP It)  )(VP (VBD was) "
          + " (NP-PRD (NP (DT the)  (NN trial)  )(PP (IN of) "
          + " (NP (NP (NN oleomargarine)  (NN heir)  )(NP (NNP Minot) "
          + " (PRN (-LRB- -LRB-) (NNP Mickey) "
          + " (-RRB- -RRB-) )(NNP Jelke)  )))(PP (IN for) "
          + " (NP (JJ compulsory)  (NN prostitution) "
          + " ))(PP-LOC (IN in)  (NP (NNP New)  (NNP York) "
          + " )))(SBAR (WHNP-1 (WDT that)  )(S (VP (VBD put) "
          + " (NP (DT the)  (NN spotlight)  )(PP (IN on)  (NP (DT the) "
          + " (JJ international)  (NN play-girl)  ))))))(. .)  ))");

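      StringBuffer parseString = new StringBuffer();
      // Pass this reference into the show method, which appends the tree to it
      p1.show(parseString);
      outputFileWriter.write(parseString.toString());
      // Not strictly needed: try-with-resources closes (and so flushes) the writer
      outputFileWriter.flush();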

    } catch (IOException ex) {
      ex.printStackTrace();
    }
  }

}
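
On the Python side, a minimal sketch for reading that output back might look like the following. It assumes the file holds Penn Treebank-style bracketed text such as `(TOP (S (NP (PRP It)) ...))`, which is what show is expected to write. The function names parse_bracketed and leaves are made up for this illustration and are not part of OpenNLP or any library.

```python
import re


def parse_bracketed(text):
    """Convert bracketed tree text into nested lists.

    '(S (NP (PRP It)))' becomes [['S', ['NP', ['PRP', 'It']]]].
    The outer list holds every top-level tree found in the text.
    """
    # Split into '(', ')' and runs of any other non-space characters
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    stack = [[]]  # stack[0] collects the top-level trees
    for tok in tokens:
        if tok == "(":
            node = []
            stack[-1].append(node)  # attach the new node to its parent
            stack.append(node)      # and descend into it
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
        else:
            stack[-1].append(tok)   # a label or a word
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def leaves(node):
    """Return the words of one nested-list tree, left to right."""
    words = []
    for child in node[1:]:  # node[0] is the label
        if isinstance(child, list):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


if __name__ == "__main__":
    sample = "(TOP (S (NP (PRP It)) (VP (VBD was) (NP (DT the) (NN trial))) (. .)))"
    tree = parse_bracketed(sample)[0]
    print(tree[0])       # label of the root node
    print(leaves(tree))  # the sentence tokens
```

To load the file written by the Java program, read its contents and pass them in, for example `parse_bracketed(open(path).read())`, where path is whatever file name was given to ParseWriter. If NLTK is already installed, nltk.Tree.fromstring reads the same bracketed notation and returns a Tree object instead of plain lists.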
