Solr - Wildcard search varies with stemming methods

I have two versions of Solr running on my machine, say SolrVer1 and SolrVer2.

SolrVer1 applies the following stemming filters to the field type text_en_splitting:

<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" ignoreCase="true"/>
 <filter class="solr.PorterStemFilterFactory" ignoreCase="true"/>

SolrVer2 applies the following stemming filter to the field type text_en_splitting:

<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>

Regular searches behave almost the same on both, but wildcard searches on SolrVer1 do not return results for grammatical variants of the word.

For example, when searching for ray*, SolrVer1 returns far fewer results than SolrVer2. Looking at the results, I found that SolrVer1 does not return documents that contain only ray or rays.

I don't know when I should use SnowballPorterFilterFactory and when I should use PorterStemFilterFactory. What are the pros and cons of each?

Does anybody have an idea about this behavior?

Thanks

You need to know what the stemmers output for ray and rays.

Try stemming them with the online Porter stemmer tool: http://qaa.ath.cx/porter_js_demo.html. It outputs rai! That's the reason you don't get any matches for ray* with the Porter stemmer.

And here is a tool for the Snowball stemmer: http://snowball.tartarus.org/demo.php. It outputs ray for both ray and rays, which is why you get the results.

You may want to read this for a comparison of the two stemmers: http://snowball.tartarus.org/texts/introduction.html

It appears that Snowball was designed to address such shortcomings of Porter.

Analyzers

On wildcard and fuzzy searches, no text analysis is performed on the search word.

Since no analysis is done at query time for wildcard searches, the stemmers are not applied to the query term; they are applied only at index time.
The results therefore differ depending on what the stemmers produced when the documents were indexed.
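The effect can be sketched in a few lines of Python. This is only a toy model, not Solr's actual implementation: the stems for ray and rays are hard-coded from the outputs reported above, the three documents and the helper names are hypothetical, and any other token is assumed to be indexed unchanged.

```python
# Stemmer outputs as reported above (hard-coded, not computed).
PORTER = {"ray": "rai", "rays": "rai"}
SNOWBALL = {"ray": "ray", "rays": "ray"}

# Hypothetical documents for illustration.
DOCS = {1: "ray", 2: "rays", 3: "rayon fabric"}


def build_index(docs, stems):
    """Index time: stem every token and map each term to its document ids."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            term = stems.get(token, token)  # unknown tokens stay as-is (assumption)
            index.setdefault(term, set()).add(doc_id)
    return index


def wildcard_search(index, query):
    """Query time: prefix-match the raw query; no stemming is applied to it."""
    prefix = query.rstrip("*")
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return hits


print(sorted(wildcard_search(build_index(DOCS, PORTER), "ray*")))
print(sorted(wildcard_search(build_index(DOCS, SNOWBALL), "ray*")))
```

In the Porter-style index the documents containing only ray or rays are stored under rai, so the prefix ray matches only the rayon document. In the Snowball-style index they are stored under ray, so all three documents match.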
