Elasticsearch: highlighting with wildcard searches

Question

I have found that highlighting behaves a little differently with wildcards. When I search using a single "*" (one wildcard character), none of the values are highlighted. But if I do the same using two or more "*" characters, all the values are highlighted. The results fetched are the same, so why is there such a difference in highlighting? Example:

1. Multiple wildcards

{
  "from": 0,
  "size": 10,
  "_source": {
    "includes": [
      "ID"
    ]
  },
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "query_string": {
                  "query": "**",
                  "fields": [
                    "ID"
                  ]
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "fragment_size": 0,
    "order": "score",
    "number_of_fragments": 4,
    "fields": {
      "*": {}
    }
  }
}

Results:

{
  "_index": "index_name",
  "_type": "_doc",
  "_id": "AUTO",
  "_score": 1,
  "_source": {
    "ID": "AUTO"
  },
  "highlight": {
    "ID": [
      "<em>AUTO</em>"
    ]
  }
}

2. Single wildcard

{
  "from": 0,
  "size": 10,
  "_source": {
    "includes": [
      "ID"
    ]
  },
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "query_string": {
                  "query": "*",
                  "fields": [
                    "ID"
                  ]
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "fragment_size": 0,
    "order": "score",
    "number_of_fragments": 4,
    "fields": {
      "*": {}
    }
  }
}

Results:

{
  "_index": "index_name",
  "_type": "_doc",
  "_id": "AUTO",
  "_score": 1,
  "_source": {
    "ID": "AUTO"
  }
}

Answer 1

I do not have a direct answer to your question about why these queries behave differently, but you can use a wildcard query instead.

{
  "_source": {
    "includes": [
      "ID"
    ]
  },
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "wildcard": {
                  "ID": "*"
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "fragment_size": 0,
    "order": "score",
    "number_of_fragments": 4,
    "fields": {
      "*": {}
    }
  }
}

Hope this helps.

Answer 2

Please take a look at the documentation for the Query String Query. It says:

Pure wildcards \* are rewritten to exists queries for efficiency. As a consequence, the wildcard "field:*" would match documents with an empty value like the following: { "field": "" } ... and would not match if the field is missing or set with an explicit null value like the following: { "field": null }

So I'm guessing a single * is treated in a special way because of this.
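
To make the quoted passage concrete: going by the documentation, a lone * on the ID field would be rewritten to something like the exists query below. This is only a sketch of the rewritten form, not output captured from Elasticsearch. An exists query carries no terms for the highlighter to match, which would be consistent with the missing highlight block, but that link is an assumption, not something the documentation states.

```json
{
  "_source": {
    "includes": [
      "ID"
    ]
  },
  "query": {
    "exists": {
      "field": "ID"
    }
  },
  "highlight": {
    "type": "unified",
    "fields": {
      "ID": {}
    }
  }
}
```

Running this against the same index and comparing the response with the single-wildcard result above is one way to check whether the two behave the same.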
