Solr - Wildcard Search Varies with Stemming Methods
I have two versions of Solr running on my machine, say SolrVer1 and SolrVer2.
SolrVer1 applies the following stemming filters to the field type text_en_splitting:
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" ignoreCase="true"/>
<filter class="solr.PorterStemFilterFactory" ignoreCase="true"/>
SolrVer2 applies the following stemming filter to the field type text_en_splitting:
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
Regular searches behave almost identically on both, but wildcard searches do not return grammatical variants on SolrVer1.
For example, searching with ray* returns far fewer results on SolrVer1 than on SolrVer2. When I looked at the results, I found that SolrVer1 does not return documents that contain only ray or rays.
I don't know when I should use SnowballPorterFilterFactory and when I should use PorterStemFilterFactory. What are the pros and cons of each?
Does anybody have an idea about this behavior?
Thanks
We need to know what the stemmers output for ray and rays.
Try stemming them with the online Porter stemmer tool: http://qaa.ath.cx/porter_js_demo.html. It outputs rai! The index therefore holds rai rather than ray, which is why you don't get any matches for ray* with the Porter stemmer.
And here is a tool for the Snowball stemmer: http://snowball.tartarus.org/demo.php. It outputs ray for both ray and rays, which is why you get results.
You may want to read this for a comparison of the two stemmers: http://snowball.tartarus.org/texts/introduction.html
It appears that Snowball was designed to address such shortcomings of Porter.
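As far as I can tell, the rai versus ray difference comes from how each algorithm treats a trailing y. Below is a minimal Python sketch of just that one rule, not a full stemmer. The function names are made up for illustration, the plural handling is heavily simplified, and the original Porter rule here ignores the case where y itself counts as a vowel.

```python
# Simplified illustration only: two rules from each algorithm, nothing more.
PORTER_VOWELS = set("aeiou")    # simplification: ignores y acting as a vowel
PORTER2_VOWELS = set("aeiouy")  # Porter2 treats y as a vowel as well


def strip_plural_s(word):
    """Very rough stand-in for step 1a: drop one trailing 's' (but not 'ss')."""
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def porter_y_rule(word):
    """Original Porter, step 1c: y -> i when the rest of the word has a vowel."""
    if word.endswith("y") and any(ch in PORTER_VOWELS for ch in word[:-1]):
        return word[:-1] + "i"
    return word


def porter2_y_rule(word):
    """Porter2 (Snowball English), step 1c: y -> i only when the y follows
    a non-vowel that is not the first letter of the word."""
    if word.endswith("y") and len(word) > 2 and word[-2] not in PORTER2_VOWELS:
        return word[:-1] + "i"
    return word


if __name__ == "__main__":
    for w in ("ray", "rays"):
        base = strip_plural_s(w)
        print(w, "->", porter_y_rule(base), "(Porter) /",
              porter2_y_rule(base), "(Porter2)")
```

In ray the y follows the vowel a, so the Porter2 rule leaves it alone, while the original rule only asks whether the preceding part contains a vowel and rewrites it to rai.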
On wildcard and fuzzy searches, no text analysis is performed on the search word.
Because no analysis is done at query time for wildcard searches, the stemmers are applied only at index time, and the wildcard term is matched as typed against the stemmed tokens in the index.
The results therefore differ depending on what the stemmers produce.
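To make this concrete, here is a toy Python model of the behavior, not Solr code. The stem tables only hard-code the outputs reported in this thread (rai for Porter, ray for Snowball). The three sample documents, the helper names, and the choice to leave any other token unchanged are assumptions made for the sketch.

```python
# Stems as reported in this thread; any other token is left unchanged (assumption).
PORTER_STEMS = {"ray": "rai", "rays": "rai"}
SNOWBALL_STEMS = {"ray": "ray", "rays": "ray"}

# Hypothetical sample documents, one word each.
DOCS = {1: "ray", 2: "rays", 3: "rayon"}


def build_index(docs, stems):
    """Index time: every token goes through the stemmer before it is stored."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            term = stems.get(token, token)
            index.setdefault(term, set()).add(doc_id)
    return index


def term_search(index, word, stems):
    """Regular query: the search word is analyzed (stemmed) like the documents."""
    return sorted(index.get(stems.get(word, word), set()))


def prefix_search(index, prefix):
    """Wildcard query such as ray*: the prefix is NOT stemmed, it is compared
    as typed against the stemmed terms stored in the index."""
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return sorted(hits)


if __name__ == "__main__":
    for name, stems in (("Porter", PORTER_STEMS), ("Snowball", SNOWBALL_STEMS)):
        index = build_index(DOCS, stems)
        print(name, "rays ->", term_search(index, "rays", stems))
        print(name, "ray*  ->", prefix_search(index, "ray"))
```

With the Porter table the regular search for rays still finds documents 1 and 2, but ray* only reaches document 3, because the indexed term rai does not start with ray. With the Snowball table both kinds of search find documents 1 and 2.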