
Training a categorizer model in OpenNLP

I am trying to train a model with the code below, but I keep getting an error on the DocumentCategorizerME.train() call telling me to change factory to a DoccatFactory. Why?

public void trainModel()
{
    DoccatModel model = null;

    try
    {
        InputStreamFactory factory = getInputStreamFactory(new File("D:/training.txt"));
        ObjectStream<String> lineStream = new PlainTextByLineStream(factory, Charset.defaultCharset());
        ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);
        TrainingParameters params = new TrainingParameters();
        params.put(TrainingParameters.ITERATIONS_PARAM, "100");
        params.put(TrainingParameters.CUTOFF_PARAM, "0");

        // The error is reported on this line: factory is an InputStreamFactory
        model = DocumentCategorizerME.train("en", sampleStream, params, factory);
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
}

public static InputStreamFactory getInputStreamFactory(final File file) throws IOException{
    return new InputStreamFactory() {

        @Override
        public InputStream createInputStream() throws IOException {
            return new FileInputStream(file);
        }
    };
}

When you call DocumentCategorizerME.train(...), the fourth argument must be a DoccatFactory, not an InputStreamFactory. Try:

  model = DocumentCategorizerME.train("en", sampleStream, params, new DoccatFactory());

Hope this helps.
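For reference, a complete version of the training method with that change might look like the sketch below. It makes a few assumptions that are not in the original post: an OpenNLP release (1.8 or later) in which train(...) takes a DoccatFactory as its fourth argument, the library's MarkableFileInputStreamFactory in place of the hand-written getInputStreamFactory helper, UTF-8 as the file encoding, and a training file with one sample per line (category first, then the whitespace-separated tokens). The class name DoccatTrainer is hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import opennlp.tools.doccat.DoccatFactory;
import opennlp.tools.doccat.DoccatModel;
import opennlp.tools.doccat.DocumentCategorizerME;
import opennlp.tools.doccat.DocumentSample;
import opennlp.tools.doccat.DocumentSampleStream;
import opennlp.tools.util.InputStreamFactory;
import opennlp.tools.util.MarkableFileInputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;

// Hypothetical wrapper class, used only for illustration.
public class DoccatTrainer {

    // Trains a document categorizer from a file holding one sample per line.
    public static DoccatModel trainModel(File trainingFile) throws IOException {
        // The InputStreamFactory is only used to read the training data.
        InputStreamFactory in = new MarkableFileInputStreamFactory(trainingFile);

        TrainingParameters params = new TrainingParameters();
        params.put(TrainingParameters.ITERATIONS_PARAM, "100");
        params.put(TrainingParameters.CUTOFF_PARAM, "0");

        // try-with-resources closes both streams once training finishes.
        try (ObjectStream<String> lines =
                 new PlainTextByLineStream(in, StandardCharsets.UTF_8); // assumed encoding
             ObjectStream<DocumentSample> samples = new DocumentSampleStream(lines)) {
            // The fourth argument is a DoccatFactory, not the InputStreamFactory.
            return DocumentCategorizerME.train("en", samples, params, new DoccatFactory());
        }
    }
}
```

With this shape, the call from the question would become DoccatTrainer.trainModel(new File("D:/training.txt")), and the caller decides how to handle the IOException.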
