Elasticsearch match complete array of terms

I need to match a complete array of terms with Elasticsearch. Only documents whose array contains exactly the same elements should be returned: the document's array must contain neither additional elements nor just a subset of them. The order of elements does not matter.

Example:

 filter:
   id: ["a", "b"]

 documents:  
   id: ["a", "b"] -> match  
   id: ["b", "a"] -> match  
   id: ["a"] -> no match  
   id: ["a", "b", "c"] -> no match  

Eventually I want to use the Java High Level REST Client to implement the query, though an example in the Elasticsearch query DSL will do as well.

I'd like to propose something that saves you from maintaining a long chain of "must" conditions when your requirements change (e.g., imagine you have an array of six items to match). I'm going to rely on a script query, which might look over-engineered, but it is easy to create a search template out of it ( https://www.elastic.co/guide/en/elasticsearch/reference/7.5/search-template.html ).

{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
              def ids = new ArrayList(doc['id.keyword']);
              def param = new ArrayList(params.terms);
              def isSameSize = ids.size() == param.size();
              def isSameContent = ids.containsAll(param);
              return isSameSize && isSameContent
            """,
            "lang": "painless",
            "params": {
              "terms": [ "a", "b" ]
            }
          }
        }
      }
    }
  }
}

This way, the only thing you need to change is the value of the terms parameter.
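
To check the script's logic against the examples from the question, here is a minimal plain-Java sketch of the same two checks (equal size, and every requested term present). The class `ExactArrayMatch` and its `matches` method are hypothetical names used only for illustration; they are not part of Elasticsearch or its Java client. The sketch assumes that neither list contains duplicates, since `containsAll` combined with a size check is only equivalent to set equality under that assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; mirrors the Painless script in plain Java.
class ExactArrayMatch {

    // Returns true when both lists have the same size and docValues contains every term.
    // Assumption: neither list contains duplicate entries.
    static boolean matches(List<String> docValues, List<String> terms) {
        List<String> ids = new ArrayList<>(docValues);
        List<String> param = new ArrayList<>(terms);
        boolean isSameSize = ids.size() == param.size();
        boolean isSameContent = ids.containsAll(param);
        return isSameSize && isSameContent;
    }

    public static void main(String[] args) {
        List<String> filter = Arrays.asList("a", "b");
        System.out.println(matches(Arrays.asList("a", "b"), filter));      // true
        System.out.println(matches(Arrays.asList("b", "a"), filter));      // true
        System.out.println(matches(Arrays.asList("a"), filter));           // false
        System.out.println(matches(Arrays.asList("a", "b", "c"), filter)); // false
    }
}
```

The size check is what rules out the superset case (["a", "b", "c"]), while `containsAll` rules out documents of the right size with different values.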

While this does not seem to be supported natively, you can use a script filter to achieve this behavior, like so:

GET your_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "script": {
            "script": "doc['tags'].values.length == 2"
          }
        },
        {
          "term": {
            "tags": {
              "value": "a"
            }
          }
        },
        {
          "term": {
            "tags": {
              "value": "b"
            }
          }
        }
      ]
    }
  }
}

The script filter restricts the search results by array size, while the term filters specify the values that the array must contain. Make sure to enable fielddata on the tags field in order to execute scripts on it.
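
Because this query needs one term clause per value, writing it by hand gets tedious for longer arrays. The following is a rough sketch of how the request body could be generated from a list of terms using only the Java standard library. `TermsQueryJson`, `build`, and `escape` are hypothetical names, not part of any Elasticsearch client. The sketch assumes the terms are distinct plain strings without control characters and that the field name contains no single quotes; a real implementation would more likely use a JSON library or the client's query builders.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; builds the bool/must query body as a JSON string.
class TermsQueryJson {

    // Minimal JSON string escaping: backslashes and double quotes only (assumption: no control characters).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // One script clause for the array size plus one term clause per value.
    // Assumption: terms are distinct, so terms.size() equals the expected array length.
    static String build(String field, List<String> terms) {
        String f = escape(field);
        StringBuilder sb = new StringBuilder();
        sb.append("{\"query\":{\"bool\":{\"must\":[");
        sb.append("{\"script\":{\"script\":\"doc['").append(f)
          .append("'].values.length == ").append(terms.size()).append("\"}}");
        for (String term : terms) {
            sb.append(",{\"term\":{\"").append(f)
              .append("\":{\"value\":\"").append(escape(term)).append("\"}}}");
        }
        sb.append("]}}}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the compact JSON equivalent of the query shown above.
        System.out.println(build("tags", Arrays.asList("a", "b")));
    }
}
```

The resulting string could then be sent as the body of a `GET your_index/_search` request.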
