How to get lemmas from sentences in DKPro/UIMA?

I'm trying to set up a pipeline that produces lemmatized sentences. I know how to get either all sentences or all lemmas, but I don't know how to get the lemmas grouped by sentence. Here is a code snippet with the missing argument marked by ??????:

AnalysisEngine pipeline = createEngine(createEngineDescription( 
                              createEngineDescription(BreakIteratorSegmenter.class),
                              createEngineDescription(StanfordLemmatizer.class),
                              createEngineDescription(StopWordRemover.class, StopWordRemover.PARAM_MODEL_LOCATION,
                                  new String[]{"stopwords.txt"})));

JCas jcas = JCasFactory.createJCas();

jcas.setDocumentText    ("Almost all energy on Earth comes from the Sun. Plants make food energy from sunlight.");
jcas.setDocumentLanguage("en");
pipeline.process        (jcas);

for (Sentence s : select(jcas, Sentence.class)) {
  out.println("");

  for (Lemma l : select(??????, Lemma.class)) 
    out.print(l.getValue() + " ");
}

What do I need to change in this code so that it prints the lemmas of the two input sentences on two separate lines?

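Here you go. Use JCasUtil.selectCovered from uimaFIT, which returns the annotations of the given type that lie within the span of a covering annotation, here the sentence s: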

for (Lemma l : JCasUtil.selectCovered(Lemma.class, s)) 
    out.print(l.getValue() + " ");

Disclosure: I am working on the Apache UIMA project
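
To illustrate why selectCovered groups lemmas by sentence, here is a minimal stand-alone sketch of the containment rule it relies on: an annotation counts as covered when its character offsets fall inside those of the covering annotation. This is a simplified model, not the uimaFIT implementation. The Span class, the method names, and the hand-labeled offsets and lemma values are all hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CoveredSelection {

    /** Hypothetical stand-in for a UIMA annotation: character offsets plus a value. */
    static final class Span {
        final int begin;
        final int end;
        final String value;

        Span(int begin, int end, String value) {
            this.begin = begin;
            this.end = end;
            this.value = value;
        }
    }

    /** Returns the candidates whose offsets lie entirely inside the covering span. */
    static List<Span> selectCovered(List<Span> candidates, Span covering) {
        List<Span> covered = new ArrayList<>();
        for (Span c : candidates) {
            if (c.begin >= covering.begin && c.end <= covering.end) {
                covered.add(c);
            }
        }
        return covered;
    }

    /** Builds one space-separated line of lemma values per sentence. */
    static List<String> lemmaLines(List<Span> sentences, List<Span> lemmas) {
        List<String> lines = new ArrayList<>();
        for (Span s : sentences) {
            List<String> values = new ArrayList<>();
            for (Span l : selectCovered(lemmas, s)) {
                values.add(l.value);
            }
            lines.add(String.join(" ", values));
        }
        return lines;
    }

    /** Hand-labeled offsets for the text "Plants grow. Sun shines." (illustrative data only). */
    static List<String> demo() {
        List<Span> sentences = Arrays.asList(
                new Span(0, 12, null), new Span(13, 24, null));
        List<Span> lemmas = Arrays.asList(
                new Span(0, 6, "plant"), new Span(7, 11, "grow"),
                new Span(13, 16, "sun"), new Span(17, 23, "shine"));
        return lemmaLines(sentences, lemmas);
    }

    public static void main(String[] args) {
        // Prints one line of lemmas per sentence.
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

In the real pipeline the offsets come from the segmenter and lemmatizer annotations stored in the CAS, so no manual bookkeeping like this is needed; the sketch only shows the selection rule.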
