Solr 4.10 - Suggester is not working with multi-valued field

Hello everyone, I am using Solr 4.10 and I am not getting the results I expect. I want auto-complete suggestions drawn from multiple fields: discountCatName, discountSubName and vendorName. I created a multi-valued field, "suggestions", using copyField, and I use that field for lookups in the suggester configuration.

Note: discountSubName and discountCatName are themselves multi-valued fields; vendorName is a string. This is the suggestions field data from one of my documents:

"suggestions": [ 
  "Budget Car Rental", 
  "Car Rentals", 
  "Business Deals", 
  "Auto", 
  "Travel", 
  "Car Rentals" ]

If I type "car", I get "Budget Car Rental" in my suggestions but not "Car Rentals". My configuration is below. Let me know if I need to change the tokenizer or filters. Any help with this would be appreciated.

Below are the suggestion field, fieldType, searchComponent and request handler, in that order, that I use for auto-complete suggestions:

<!--suggestion field -->
<field name="suggestions" type="suggestType" indexed="true" stored="true" multiValued="true"/>
<copyField source="discountCatName" dest="suggestions"/>
<copyField source="discountSubName" dest="suggestions"/>
<copyField source="vendorName" dest="suggestions"/>
<!--suggest fieldType -->

<fieldType name="suggestType" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" " />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
</fieldType>

<!--suggest searchComponent configuration -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">analyzing</str>
    <str name="lookupImpl">BlendedInfixLookupFactory</str>
    <str name="suggestAnalyzerFieldType">suggestType</str>
    <str name="blenderType">linear</str>
    <str name="minPrefixChars">1</str>
    <str name="doHighlight">false</str>
    <str name="weightField">score</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">suggestions</str>
    <str name="buildOnStartup">true</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<!--suggest request handler -->
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy" >
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">analyzing</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

By debugging the Solr 4.10 source code, I discovered a bug in the DocumentDictionaryFactory lookup: for a multi-valued field it only ever looks at the first value and then stops producing suggestions from that document. That is why my configuration above does not give the expected output.

I created a separate index for each of the fields I want to search on, such as catName0 ... catName10 and subName0 ... subName10, and then configured a separate suggestion dictionary for each of those fields. Finally, I parsed the responses from all the suggestion dictionaries, merged them, and sorted the result by weight and highlight position.
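
The merge step could look roughly like the sketch below. It is not my actual code: the `Suggestion` class, `merge` and `matchPosition` are hypothetical names, parsing of the suggester responses is left out, and "highlight position" is approximated as the index of the first case-insensitive match of the query in the term. Dropping duplicate terms (keeping the highest weight) is also an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SuggestionMerger {

    /** Hypothetical holder for one suggestion parsed from a suggester response. */
    public static final class Suggestion {
        public final String term;
        public final long weight;

        public Suggestion(String term, long weight) {
            this.term = term;
            this.weight = weight;
        }
    }

    /** Index of the query inside the term (case-insensitive); unmatched terms sort last. */
    static int matchPosition(String term, String query) {
        int idx = term.toLowerCase(Locale.ROOT).indexOf(query.toLowerCase(Locale.ROOT));
        return idx < 0 ? Integer.MAX_VALUE : idx;
    }

    /**
     * Merges the suggestion lists returned by several dictionaries,
     * orders them by weight (highest first) and then by match position
     * (earliest first), and returns at most {@code limit} terms.
     */
    public static List<String> merge(List<List<Suggestion>> perDictionary, String query, int limit) {
        // Assumption: the same term coming from two dictionaries is kept once,
        // with the highest weight seen.
        Map<String, Suggestion> best = new LinkedHashMap<>();
        for (List<Suggestion> suggestions : perDictionary) {
            for (Suggestion s : suggestions) {
                Suggestion previous = best.get(s.term);
                if (previous == null || s.weight > previous.weight) {
                    best.put(s.term, s);
                }
            }
        }

        List<Suggestion> merged = new ArrayList<>(best.values());
        merged.sort(Comparator
                .comparingLong((Suggestion s) -> s.weight).reversed()
                .thenComparingInt(s -> matchPosition(s.term, query)));

        List<String> terms = new ArrayList<>();
        for (Suggestion s : merged) {
            if (terms.size() >= limit) {
                break;
            }
            terms.add(s.term);
        }
        return terms;
    }
}
```

With equal weights, a term such as "Car Rentals" sorts ahead of "Budget Car Rental" for the query "car", because the match starts earlier in the term.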

It is a lengthy approach, but there was no other way because Solr 4.10 was required.
