
Lucene - default search lemmatization/stemming

Does Lucene's default search perform lemmatization or stemming on words?

For example, when using the code in this sample, are the words in the docs used as-is, or are they reduced to a base form (e.g. Managing -> manag)? If so, which lemmatizer does it use by default?

The sample referred to in your post uses Lucene's StandardAnalyzer, which does not do stemming.

If you want stemming, you need to use another Analyzer implementation, e.g. SnowballAnalyzer.
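To see the difference yourself, you can run the same text through two analyzers and print the terms each one produces. The following is only a rough sketch, not the code from the sample in the question. It assumes a recent Lucene release (5.x or later, where analyzers have no-argument constructors) with the analysis-common module on the classpath. It also swaps in EnglishAnalyzer, which applies a Porter stemmer, because as far as I know SnowballAnalyzer is no longer shipped in newer Lucene versions. The class name AnalyzerComparison and the field name "body" are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class, not part of Lucene or of the original sample.
final class AnalyzerComparison {

    // Runs the text through the given analyzer and collects the emitted terms.
    static List<String> terms(Analyzer analyzer, String text) throws IOException {
        List<String> result = new ArrayList<>();
        // "body" is an arbitrary field name chosen for this sketch.
        try (TokenStream stream = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset();                    // required before the first incrementToken()
            while (stream.incrementToken()) {
                result.add(term.toString());
            }
            stream.end();                      // signals end of stream before close()
        }
        return result;
    }

    // StandardAnalyzer: tokenizes and lowercases, but does not stem.
    static List<String> standardTerms(String text) throws IOException {
        try (Analyzer analyzer = new StandardAnalyzer()) {
            return terms(analyzer, text);
        }
    }

    // EnglishAnalyzer: lowercases, drops English stop words, and applies Porter stemming.
    static List<String> stemmedTerms(String text) throws IOException {
        try (Analyzer analyzer = new EnglishAnalyzer()) {
            return terms(analyzer, text);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(standardTerms("Managing"));
        System.out.println(stemmedTerms("Managing"));
    }
}
```

With the input "Managing", the first call should give the lowercased term managing and the second the stem manag, which matches the example in the question. Whichever analyzer you pick, use the same one for indexing and for parsing queries, otherwise stemmed index terms will not match unstemmed query terms.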

