
Elasticsearch - Highlight of Wildcard Search

I have noticed that highlighting behaves differently depending on the wildcard pattern. When I search using a single "*" (one wildcard character), none of the values are highlighted. But if I run the same search using two or more "*" characters, all the values are highlighted. The results fetched are the same in both cases, so why does the highlighting differ? Example:

1. Multiple wildcards:

{
  "from": 0,
  "size": 10,
  "_source": {
    "includes": [
      "ID"
    ]
  },
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "query_string": {
                  "query": "**",
                  "fields": [
                    "ID"
                  ]
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "fragment_size": 0,
    "order": "score",
    "number_of_fragments": 4,
    "fields": {
      "*": {}
    }
  }
}

Results:

{
  "_index": "index_name",
  "_type": "_doc",
  "_id": "AUTO",
  "_score": 1,
  "_source": {
    "ID": "AUTO"
  },
  "highlight": {
    "ID": [
      "<em>AUTO</em>"
    ]
  }
}

2. Single wildcard:

{
  "from": 0,
  "size": 10,
  "_source": {
    "includes": [
      "ID"
    ]
  },
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "query_string": {
                  "query": "*",
                  "fields": [
                    "ID"
                  ]
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "fragment_size": 0,
    "order": "score",
    "number_of_fragments": 4,
    "fields": {
      "*": {}
    }
  }
}   

Results:

{
  "_index": "index_name",
  "_type": "_doc",
  "_id": "AUTO",
  "_score": 1,
  "_source": {
    "ID": "AUTO"
  }
}

I do not have a direct answer to your question about why these queries behave differently, but you can use a wildcard query instead:

{
  "_source": {
    "includes": [
      "ID"
    ]
  },
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "wildcard": {
                  "ID": "*"
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "fragment_size": 0,
    "order": "score",
    "number_of_fragments": 4,
    "fields": {
      "*": {}
    }
  }
}

Hope this helps.

Please take a look at the documentation of the Query String Query. It states:

Pure wildcards \* are rewritten to exists queries for efficiency. As a consequence, the wildcard "field:*" would match documents with an empty value like the following: { "field": "" } ... and would not match if the field is missing or set with an explicit null value like the following: { "field": null }

So I'm guessing that a single * is treated in a special way because of this rewrite, while ** is not.
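To make that guess concrete, here is a minimal Python sketch of the rewrite as the quoted documentation describes it. It is a simplified model, not Elasticsearch code: it assumes that only the exact string "*" is special-cased and that any other wildcard pattern (such as "**") is kept as a wildcard clause. The helper names `model_wildcard_rewrite` and `search_body` are hypothetical and not part of any Elasticsearch client. If the model is right, the single-wildcard search ends up as an exists clause, which carries no term or pattern for the highlighter to match against, and that would plausibly explain the missing highlight.

```python
import json

# Highlight settings copied from the requests in the question.
HIGHLIGHT = {
    "type": "unified",
    "fragment_size": 0,
    "order": "score",
    "number_of_fragments": 4,
    "fields": {"*": {}},
}


def model_wildcard_rewrite(pattern: str, field: str) -> dict:
    """Model how a wildcard-only query_string pattern might be rewritten.

    Assumption (not taken from Elasticsearch source): only the exact
    string "*" counts as a "pure wildcard". The input is expected to be
    a wildcard pattern, not arbitrary query_string syntax.
    """
    if pattern == "*":
        # Per the quoted documentation: rewritten to an exists query.
        return {"exists": {"field": field}}
    # Any other pattern, including "**", is modeled as a wildcard clause.
    return {"wildcard": {field: {"value": pattern}}}


def search_body(clause: dict) -> dict:
    """Wrap a query clause in a request body shaped like the question's."""
    return {
        "_source": {"includes": ["ID"]},
        "query": clause,
        "highlight": HIGHLIGHT,
    }


if __name__ == "__main__":
    # Print the modeled request body for each pattern from the question.
    for pattern in ("*", "**"):
        body = search_body(model_wildcard_rewrite(pattern, "ID"))
        print(pattern, "->", json.dumps(body["query"]))
```

Sending both generated bodies to a test index and comparing the highlight sections would be one way to check whether this model matches what your Elasticsearch version actually does.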
