How do I set tokenizer options when using the Simple CoreNLP API?

Question

I'm aware of the tokenizer options that are available in CoreNLP, and I know how to set them in the standard version.

Is there a way to pass the options, e.g. untokenizable=noneKeep, when using the Simple CoreNLP interfaces?

Answer 1

You can build a Document with properties. Put the tokenizer options in a Properties object and pass it to the Document constructor.

package edu.stanford.nlp.examples;

import edu.stanford.nlp.simple.*;

import java.util.*;

public class SimpleExample {

    public static void main(String[] args) {
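        // Tokenizer options go in the "tokenize.options" property.
        Properties props = new Properties();
        props.setProperty("tokenize.options", "untokenizable=allKeep");
        // Pass the properties as the first argument when constructing the Document.
        Document doc = new Document(props, "Joe Smith was born in California.  He moved to Chicago last year.");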
        for (Sentence sent : doc.sentences()) {
            System.out.println(sent.tokens());
            System.out.println(sent.nerTags());
            System.out.println(sent.parse());
        }
    }

}
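The example above sets untokenizable=allKeep, while the question asks about untokenizable=noneKeep. The value of tokenize.options is a comma-separated list of tokenizer options, so the same property can carry either value, or several options at once. Below is a minimal sketch of assembling that string using only java.util. The class TokenizerOptions and its build method are hypothetical helpers for illustration, not part of CoreNLP, and the Document call is left as a comment because it needs CoreNLP on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.StringJoiner;

// Hypothetical helper class for illustration; not part of CoreNLP.
final class TokenizerOptions {

    // Joins name=value pairs into the comma-separated string stored under
    // the "tokenize.options" property, preserving the map's iteration order.
    static Properties build(Map<String, String> options) {
        StringJoiner joined = new StringJoiner(",");
        for (Map.Entry<String, String> entry : options.entrySet()) {
            joined.add(entry.getKey() + "=" + entry.getValue());
        }
        Properties props = new Properties();
        props.setProperty("tokenize.options", joined.toString());
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("untokenizable", "noneKeep");
        Properties props = build(options);
        System.out.println(props.getProperty("tokenize.options"));
        // With CoreNLP on the classpath, props would be passed as in the answer:
        // Document doc = new Document(props, "some text");
    }
}
```

Running main prints untokenizable=noneKeep. A LinkedHashMap is used so the options appear in the order they were added.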

Answer 1 (accepted) 2019-01-08 08:38:04
