简体   繁体   English

SOLR还突出显示其他单词,但搜索查询除外

[英]SOLR highlighting other words also, other than search query

I have integrated the SOLR with Rails application by using the Sunspot gem. 我已经使用Sunspot gem将SOLR与Rails应用程序集成在一起。 And I found that sometimes the SOLR highlighting the other words. 我发现有时候SOLR会突出显示其他字眼。 If I search capital its highlighting capital borrowed , capital byputting words. 如果我搜索capital其突出显示的capital borrowedcapital byputting I don't know why its highlighting the borrowed or byputting words. 我不知道为什么它突出了borrowed or byputting单词。

My solrconfig.xml contains the following configutration 我的solrconfig.xml包含以下配置

<searchComponent class="solr.HighlightComponent" name="highlight">
<highlighting>
  <!-- Configure the standard fragmenter -->
  <!-- This could most likely be commented out in the "default" case -->
  <fragmenter name="gap" 
              default="true"
              class="solr.highlight.GapFragmenter">
    <lst name="defaults">
      <int name="hl.fragsize">100</int>
    </lst>
  </fragmenter>

  <!-- A regular-expression-based fragmenter 
       (for sentence extraction) 
    -->
  <fragmenter name="regex" 
              class="solr.highlight.RegexFragmenter">
    <lst name="defaults">
      <!-- slightly smaller fragsizes work better because of slop -->
      <int name="hl.fragsize">70</int>
      <!-- allow 50% slop on fragment sizes -->
      <float name="hl.regex.slop">0.5</float>
      <!-- a basic sentence pattern -->
      <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
    </lst>
  </fragmenter>

  <!-- Configure the standard formatter -->
  <formatter name="html" 
             default="true"
             class="solr.highlight.HtmlFormatter">
    <lst name="defaults">
      <str name="hl.simple.pre"><![CDATA[<em>]]></str>
      <str name="hl.simple.post"><![CDATA[</em>]]></str>
    </lst>
  </formatter>

  <!-- Configure the standard encoder -->
  <encoder name="html" 
           class="solr.highlight.HtmlEncoder" />

  <!-- Configure the standard fragListBuilder -->
  <fragListBuilder name="simple" 
                   class="solr.highlight.SimpleFragListBuilder"/>

  <!-- Configure the single fragListBuilder -->
  <fragListBuilder name="single" 
                   class="solr.highlight.SingleFragListBuilder"/>

  <!-- Configure the weighted fragListBuilder -->
  <fragListBuilder name="weighted" 
                   default="true"
                   class="solr.highlight.WeightedFragListBuilder"/>

  <!-- default tag FragmentsBuilder -->
  <fragmentsBuilder name="default" 
                    default="true"
                    class="solr.highlight.ScoreOrderFragmentsBuilder">
    <!-- 
    <lst name="defaults">
      <str name="hl.multiValuedSeparatorChar">/</str>
    </lst>
    -->
  </fragmentsBuilder>

  <!-- multi-colored tag FragmentsBuilder -->
  <fragmentsBuilder name="colored" 
                    class="solr.highlight.ScoreOrderFragmentsBuilder">
    <lst name="defaults">
      <str name="hl.tag.pre"><![CDATA[
           <b style="background:yellow">,<b style="background:lawgreen">,
           <b style="background:aquamarine">,<b style="background:magenta">,
           <b style="background:palegreen">,<b style="background:coral">,
           <b style="background:wheat">,<b style="background:khaki">,
           <b style="background:lime">,<b style="background:deepskyblue">]]></str>
      <str name="hl.tag.post"><![CDATA[</b>]]></str>
    </lst>
  </fragmentsBuilder>

  <boundaryScanner name="default" 
                   default="true"
                   class="solr.highlight.SimpleBoundaryScanner">
    <lst name="defaults">
      <str name="hl.bs.maxScan">10</str>
      <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
    </lst>
  </boundaryScanner>

  <boundaryScanner name="breakIterator" 
                   class="solr.highlight.BreakIteratorBoundaryScanner">
    <lst name="defaults">
      <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
      <str name="hl.bs.type">WORD</str>
      <!-- language and country are used when constructing Locale object.  -->
      <!-- And the Locale object will be used when getting instance of BreakIterator -->
      <str name="hl.bs.language">en</str>
      <str name="hl.bs.country">US</str>
    </lst>
  </boundaryScanner>
</highlighting>

I have a content in html form and I am doing the highlighting on that content. 我有一个html格式的内容,并且正在对该内容进行突出显示。 Is anyone know the solution, please help me to figure the problem and its solution. 有谁知道解决方案,请帮助我解决问题及其解决方案。

It's due to the use of solr synonyms. 这是由于使用了solr同义词。 You probably have a file called synonyms.txt in your conf folder. 您可能在conf文件夹中有一个名为同义词。txt的文件。 For more info: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory 有关更多信息: https : //wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM