
Solr Spellcheck for Multi Word Phrases

I have a problem with Solr spellcheck suggestions for multi-word phrases. With the query for 'red chillies':

q=red+chillies&wt=xml&indent=true&spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true

I get:

<lst name="suggestions">
  <lst name="chillies">
    <int name="numFound">2</int>
    <int name="startOffset">4</int>
    <int name="endOffset">12</int>
    <int name="origFreq">0</int>
    <arr name="suggestion">
      <lst><str name="word">chiller</str><int name="freq">4</int></lst>
      <lst><str name="word">challis</str><int name="freq">2</int></lst>
    </arr>
  </lst>
  <bool name="correctlySpelled">false</bool>
  <str name="collation">red chiller</str>
</lst>

The problem is that even though 'chiller' has 4 results in the index, 'red chiller' has none. So we end up suggesting a phrase with 0 results.

What can I do to make spellcheck work on the whole phrase only? I tried using KeywordTokenizerFactory in the query analyzer:

<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> 
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

And I also tried adding

<str name="sp.query.extendedResults">false</str>

within

<lst name="spellchecker">

in solrconfig.xml.

But neither seems to make a difference.

What would be the best way to make spellcheck only return collations that have results for the whole phrase? Thanks!

The real issue here is that you need to specify spellcheck.collateParam.q.op=AND and also (optionally) spellcheck.collateParam.mm=100%. These parameters are applied to the internal queries Solr runs to validate each collation, so a collation is only returned when all of its terms match at least one document.
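
As a rough sketch, these parameters could be set as request handler defaults in solrconfig.xml along the following lines. The handler name /select, the component name spellcheck, and the value 5 for spellcheck.maxCollationTries are assumptions for illustration, not taken from the question. spellcheck.maxCollationTries is included because its default of 0 means collations are not tested against the index at all, in which case the collateParam overrides would have nothing to act on.

```xml
<!-- Sketch only: handler name, component name, and the tries value are assumed -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.collate">true</str>
    <!-- Test up to 5 candidate collations against the index (default 0 = no testing) -->
    <str name="spellcheck.maxCollationTries">5</str>
    <!-- Overrides used only for the internal collation test queries -->
    <str name="spellcheck.collateParam.q.op">AND</str>
    <str name="spellcheck.collateParam.mm">100%</str>
  </lst>
  <arr name="last-components">
    <!-- Must match the name of the configured spellcheck search component -->
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

The same parameters can instead be appended to an individual request, for example &spellcheck.maxCollationTries=5&spellcheck.collateParam.q.op=AND&spellcheck.collateParam.mm=100%25 (the percent sign has to be URL-encoded as %25). Note that mm is only honoured by the dismax and edismax query parsers, while q.op applies to the standard parser.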

You can read more about this in the Solr docs.
