Elasticsearch: prefer prefix match over term match

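I have a field in an Elasticsearch index that I'm trying to search, and I want documents whose value for that field starts with the search term to score higher than documents where the term sits in the middle of a long sentence. For example, when searching for "lorem",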

{
  "title": "Lorem"
}

should score higher than

{
  "title": "The time I said Lorem"
}

or

{
  "title": "The Lorem"
}

or even

{
  "title": "Lorem impsum"
}

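However, this is usually not the case with a simple match, match_phrase_prefix, or query_string query.

So far I've tried combining a prefix query with a match query while boosting the prefix, but the boost doesn't seem to work as I expected: the results come back in the same order, only with scores raised by 10.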

...
{
    "should": [
        {
            "prefix": {
                "title": {
                    "value": query,
                    "boost": 10
                }
            }
        },
        {
            "match": {
                "title": {
                    "query":     query,
                    "boost":     3,
                    "fuzziness": "AUTO"
                }
            }
        }
    ]
}
...

Also, not sure if this is relevant, but the title field is actually nested, i.e. it is alternative_names.title.

Is there an elegant solution for this in Elasticsearch?

You can use a combination of bool/should clauses to achieve the desired result.
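
As a side note on why the prefix boost in the question probably did not change the order: on an analyzed text field, a prefix query is matched against the individual indexed terms, not against the start of the whole field value, and as far as I understand it scores every matching document with a constant score by default. The sketch below imitates this in plain Python. The simple_tokens helper is a hypothetical, simplified stand-in for the standard analyzer (lowercase, split on non-alphanumerics), not Elasticsearch's real implementation.

```python
import re


def simple_tokens(text):
    # Hypothetical, simplified stand-in for the standard analyzer:
    # lowercase the text and split it on non-alphanumeric characters.
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def prefix_matches(text, prefix):
    # A prefix query matches when ANY indexed term starts with the prefix,
    # regardless of where that term sits in the original value.
    return any(tok.startswith(prefix) for tok in simple_tokens(text))


titles = ["Lorem", "The time I said Lorem", "The Lorem", "Lorem impsum"]
matches = {title: prefix_matches(title, "lorem") for title in titles}

for title, hit in matches.items():
    print(f"{title!r}: {hit}")
```

Under these assumptions all four titles match the prefix clause, so each of them would receive the same extra boost and the relative order would stay as it was.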

Adding a working example.

Index mapping:

{
  "mappings": {
    "properties": {
      "alternative_names": {
        "type": "nested",
        "properties": {
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Index data:

{
  "alternative_names": {
    "title": "Lorem"
  }
}
{
  "alternative_names": {
    "title": "The time I said Lorem"
  }
}
{
  "alternative_names": {
    "title": "The Lorem"
  }
}
{
  "alternative_names": {
    "title": "Lorem impsum"
  }
}
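
If you want to load these four documents in one request, a rough sketch of building a _bulk body in Python could look like the following. The bulk_body helper is hypothetical, and the ids 1 to 4 are an assumption chosen to line up with the ids in the search result further down.

```python
import json

titles = ["Lorem", "The time I said Lorem", "The Lorem", "Lorem impsum"]


def bulk_body(titles):
    # The _bulk API expects newline-delimited JSON: one action line,
    # then one source line per document, with a trailing newline.
    lines = []
    for doc_id, title in enumerate(titles, start=1):
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        lines.append(json.dumps({"alternative_names": {"title": title}}))
    return "\n".join(lines) + "\n"


body = bulk_body(titles)
print(body, end="")
```

The resulting string would be sent as the body of a POST to the _bulk endpoint of your index.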

Search query:

{
  "query": {
    "nested": {
      "path": "alternative_names",
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "alternative_names.title.keyword": "Lorem"
              }
            },
            {
              "match": {
                "alternative_names.title": "Lorem"
              }
            }
          ]
        }
      }
    }
  }
}
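
In application code the search term is usually a variable, so here is a minimal sketch that builds the same request body in Python. The build_query function and its exact_boost parameter are my own illustration, not part of any Elasticsearch client.

```python
import json


def build_query(term, path="alternative_names", field="title", exact_boost=None):
    full_field = f"{path}.{field}"
    keyword_field = f"{full_field}.keyword"

    # Exact match on the untokenized keyword sub-field.
    if exact_boost is None:
        exact = {"term": {keyword_field: term}}
    else:
        # Optional: weight the exact match more heavily (assumed tuning knob).
        exact = {"term": {keyword_field: {"value": term, "boost": exact_boost}}}

    # Regular full-text match on the analyzed field.
    full_text = {"match": {full_field: term}}

    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"should": [exact, full_text]}},
            }
        }
    }


query = build_query("Lorem")
print(json.dumps(query, indent=2))
```

Keep in mind that a keyword field without a normalizer is compared exactly, so with this mapping a lowercase "lorem" would only hit the match clause and not the term clause.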

Search result:

"hits": [
      {
        "_index": "66500753",
        "_type": "_doc",
        "_id": "1", 
        "_score": 1.3436072,
        "_source": {
          "alternative_names": {          // note this
            "title": "Lorem"
          }
        }
      },
      {
        "_index": "66500753",
        "_type": "_doc",
        "_id": "4",
        "_score": 0.11474907,
        "_source": {
          "alternative_names": {
            "title": "Lorem impsum"
          }
        }
      },
      {
        "_index": "66500753",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.11474907,
        "_source": {
          "alternative_names": {
            "title": "The Lorem"
          }
        }
      },
      {
        "_index": "66500753",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.07477197,
        "_source": {
          "alternative_names": {
            "title": "The time I said Lorem"
          }
        }
      }
    ]

