
How to value exact match higher than term frequency in Elasticsearch?

I have an index with several title fields: main_title, sub_titles, preferred_titles, etc.

Each of these text fields also has a suggest field, analyzed by a custom analyzer that uses an edge n-gram tokenizer so that we can search as we type.

I would like to value an exact match over term frequency. An exact match in main_title should also be worth more than an exact match in preferred_titles.

Does anyone know how I can achieve this? Thanks in advance.

I have tried a bool_query with a multi_match_query in the must clause. The multi_match is of type cross_fields, has no fields specified, and uses the operator 'and'.

I have both the text fields and the suggest fields in the should clause. Each text field is in a match_query with a boost and the operator 'and'. Each suggest field is in a match_phrase_query with a boost and the operator 'and'. The issue is that the boosts from several matching clauses are added on top of the base score, so I end up with very inflated scores.
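For readers following along, here is a minimal sketch of the request body described above, built as a plain Python dict. The suggest subfield name (`main_title.suggest`) and all boost values are assumptions made for illustration; they are not taken from the question. The sketch leaves `operator` off the `match_phrase` clauses, because that query type does not accept an `operator` parameter.

```python
import json


def build_title_query(text, text_boosts, suggest_boosts):
    """Build a bool query: cross_fields multi_match in must, boosted clauses in should."""
    should = []
    # One match query per text field, each with its own boost and the 'and' operator.
    for field, boost in text_boosts.items():
        should.append(
            {"match": {field: {"query": text, "operator": "and", "boost": boost}}}
        )
    # One match_phrase query per suggest (edge n-gram) field, each with its own boost.
    for field, boost in suggest_boosts.items():
        should.append({"match_phrase": {field: {"query": text, "boost": boost}}})
    return {
        "query": {
            "bool": {
                # No "fields" key here, mirroring the description in the question.
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "type": "cross_fields",
                            "operator": "and",
                        }
                    }
                ],
                "should": should,
            }
        }
    }


# Hypothetical boosts: main_title outranks preferred_titles.
body = build_title_query(
    "war and peace",
    text_boosts={"main_title": 3.0, "preferred_titles": 2.0},
    suggest_boosts={"main_title.suggest": 1.5, "preferred_titles.suggest": 1.0},
)
print(json.dumps(body, indent=2))
```

A bool query sums the scores of all matching clauses, so every should clause that matches adds its boosted score to the must score. That is consistent with the inflated totals described above.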

You can use rescore.

Rescoring can help to improve precision by reordering just the top (e.g. 100 - 500) documents returned by the query and post_filter phases, using a secondary (usually more costly) algorithm, instead of applying the costly algorithm to all documents in the index.
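To make the mechanism concrete, here is a rough sketch of a request body that uses Elasticsearch's built-in query rescorer (`window_size`, `rescore_query`, `query_weight`, `rescore_query_weight`). The base query, the choice of a `match_phrase` on main_title as the secondary query, and the weight values are assumptions for illustration only. Note that a phrase match on an analyzed text field only approximates an "exact match".

```python
import json


def with_phrase_rescore(body, text, field="main_title", window_size=100,
                        query_weight=1.0, rescore_query_weight=2.0):
    """Return a copy of `body` with a built-in query rescorer attached."""
    rescored = dict(body)  # shallow copy so the caller's dict is left unchanged
    rescored["rescore"] = {
        # Only the top `window_size` hits per shard are rescored.
        "window_size": window_size,
        "query": {
            # Secondary query run only on the top hits.
            "rescore_query": {"match_phrase": {field: {"query": text}}},
            # Relative weights of the original score and the rescore score.
            "query_weight": query_weight,
            "rescore_query_weight": rescore_query_weight,
        },
    }
    return rescored


# Hypothetical first-pass query; any query can go here.
base = {"query": {"match": {"main_title": {"query": "war and peace", "operator": "and"}}}}
print(json.dumps(with_phrase_rescore(base, "war and peace"), indent=2))
```

With the default score mode, the final score of a rescored hit is the weighted sum of the original score and the rescore query score, so hits that also match the phrase move up within the window.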

An example:

{
  "query": {
    ... some query
  },
  "from" : 0,
  "size" : 50,
  "rescore" : {
      "score_normalizer" : {
        "normalizer_type" : "min_max",
        "min_score" : 1,
        "max_score" : 10
      }
   }
}

Reference: https://github.com/bkatwal/elasticsearch-score-normalizer
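As a rough illustration of what a `min_max` normalizer with `min_score` 1 and `max_score` 10 would do to a page of hits, the sketch below applies ordinary min-max scaling to a list of raw scores. This assumes the plugin uses the standard min-max formula; its exact behavior (for example, when all scores are equal) may differ, so treat this as an approximation rather than the plugin's implementation.

```python
def min_max_normalize(scores, min_score=1.0, max_score=10.0):
    """Linearly rescale raw scores into [min_score, max_score], preserving order."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All hits scored the same; mapping them to max_score is a choice made
        # for this sketch, not documented plugin behavior.
        return [max_score for _ in scores]
    scale = (max_score - min_score) / (hi - lo)
    return [min_score + (s - lo) * scale for s in scores]


# Hypothetical raw scores inflated by stacked boosts.
print(min_max_normalize([2.0, 6.0, 10.0]))  # [1.0, 5.5, 10.0]
```

Normalizing this way bounds the scores without changing the ranking. It addresses the inflated numbers, but the relative ordering of exact and partial matches still depends on the underlying query.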
