The Terrier IR system uses the Porter stemmer by default. How can I use the output of a statistical stemmer in Terrier? I have generated a stem list with a statistical stemmer and want to plug it into Terrier.
You have to create a class that extends StemmerTermPipeline in the org.terrier.terms package.
For instance:
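package org.terrier.terms;

public class StatisticalStemmer extends StemmerTermPipeline {
    public StatisticalStemmer(TermPipeline next) {
        super(next);
    }

    @Override
    public String stem(String word) {
        // your method implementation: return the stem chosen by your
        // statistical stemmer; this placeholder returns the word unchanged
        return word;
    }
}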
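Since the stem list has already been generated, the body of stem() can be a simple table lookup. The following is only a minimal sketch and not part of Terrier: the class name StemLookup and the file layout (one "word stem" pair per line, separated by whitespace) are assumptions, so adapt the parsing to whatever format your statistical stemmer writes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Terrier class): maps each word to the stem
// found in a precomputed stem list.
public class StemLookup {
    private final Map<String, String> stems = new HashMap<>();

    // Assumed format: one "word stem" pair per line, whitespace-separated.
    public StemLookup(Reader source) throws IOException {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length >= 2) {
                    stems.put(parts[0], parts[1]);
                }
                // Blank or malformed lines are skipped.
            }
        }
    }

    // Returns the listed stem, or the word itself if it is not in the list.
    public String stem(String word) {
        return stems.getOrDefault(word, word);
    }
}
```

With something like this, StatisticalStemmer would load the list once (for example in its constructor, from a java.io.FileReader on your stem file) and have stem(word) return lookup.stem(word). Falling back to the original word keeps terms that are missing from the list in the index instead of dropping them.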
Afterwards, you need to recompile the core component and replace the terrier-4.0-core.jar file in the lib directory with the rebuilt one.
Lastly, you need to update the term pipeline in the properties file:
termpipelines=Stopwords,StatisticalStemmer
This way, Terrier will use your stemmer in place of the PorterStemmer.