Elasticsearch: avoiding double scoring when using an ngram analyzer

Question

Suppose I search for "hello" and the index contains one document with "hello" and another with "hello hello". I want the "hello" document to score higher.

I am using an ngram analyzer for both indexing and searching (I really need it for other scenarios), so "hello hello" is matched twice and shows up as the top result. Is there any way to avoid this? I have already tried term, match phrase, and multi match queries; all of them score "hello hello" higher.

Answer 1

I solved this by adding a duplicate, unanalyzed (keyword) field to the document and using a bool query whose should clause adds a term query on that field, which boosts exact matches.

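// The term clause contributes to the score only when the whole keyword
// field equals "hello", so "hello hello" gets no extra score from it.
var res = client.Search<MyClass>(s => s
  .Query(q => q
    .Bool(b1 => b1
      .Should(
        // Exact match against the unanalyzed (keyword) copy of the field.
        s1 => s1.Term(m => m
          .Field(f => f._DUPLICATE_COLUMN)
          .Value("hello")
          .Boost(1) // 1 is the default; a larger value widens the gap
        ),
        // Regular ngram match, so partial matches are still returned.
        s1 => s1.Match(m => m
          .Field(f => f.MY_COLUMN)
          .Query("hello")
          .Analyzer("myNgramSearchAnalyzer")
        )
      )
      .MinimumShouldMatch(1)
    )
  )
);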
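
To see why the plain ngram match favors "hello hello", here is a rough, self-contained sketch that counts how many indexed grams each document shares with the query. It is only an illustration: the gram size of 3 and the whitespace splitting are assumptions (the actual tokenizer settings are not given in the question), `NgramOverlapDemo` and its methods are hypothetical names, and real Elasticsearch scoring also applies term-frequency saturation and field-length normalization, so the real scores will not be these raw counts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NgramOverlapDemo
{
    // Split text into fixed-size character n-grams, one word at a time.
    // Assumption: words are separated by spaces and grams do not span words.
    public static List<string> Ngrams(string text, int size)
    {
        var grams = new List<string>();
        var words = text.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            for (var i = 0; i + size <= word.Length; i++)
            {
                grams.Add(word.Substring(i, size));
            }
        }
        return grams;
    }

    // For each distinct query gram, add up how often it occurs in the document.
    public static int MatchedOccurrences(string query, string document, int size)
    {
        var docCounts = Ngrams(document, size)
            .GroupBy(g => g)
            .ToDictionary(g => g.Key, g => g.Count());
        return Ngrams(query, size)
            .Distinct()
            .Sum(g => docCounts.TryGetValue(g, out var n) ? n : 0);
    }

    public static void Main()
    {
        // "hello" yields the grams hel, ell, llo.
        Console.WriteLine(MatchedOccurrences("hello", "hello", 3));       // 3
        Console.WriteLine(MatchedOccurrences("hello", "hello hello", 3)); // 6
    }
}
```

Every query gram occurs twice in "hello hello", which is the "matched twice" effect described in the question. The keyword field avoids it because it is indexed as a single term, so only the document whose whole value is "hello" matches the term clause.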

Answer 1 (accepted, 2017-07-06 21:32:47)