
ES query to match all elements in array

I have this document with a nested array that I want to filter with the query below.

I want ES to return only the documents in which every item has changes = 0. If a document has even a single item in the list with changes = 1, it should be discarded.

Is there any way I can achieve this starting from the query I have already written? Or should I use a script instead?

DOCUMENTS:

{
    "id": "abc",
    "_source" : {
        "trips" : [
            {
                "type" : "home",
                "changes" : 0
            },
            {
                "type" : "home",
                "changes" : 1
            }
        ]
    }
},
{
    "id": "def",
    "_source" : {
        "trips" : [
            {
                "type" : "home",
                "changes" : 0
            },
            {
                "type" : "home",
                "changes" : 0
            }
        ]
    }
}

QUERY:

GET trips_solutions/_search

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "id": {
                  "value": "abc"
                }
              }
            },
            {
              "nested": {
                "path": "trips",
                "query": {
                  "range": {
                    "trips.changes": {
                      "gt": -1,
                      "lt": 1
                    }
                  }
                }
              }
            }
          ]
        }
      }
    }

EXPECTED RESULT:

{
    "id": "def",
    "_source" : {
        "trips" : [
            {
                "type" : "home",
                "changes" : 0
            },
            {
                "type" : "home",
                "changes" : 0
            }
        ]
    }
}

Elasticsearch version: 7.6.2

I have already read these answers, but they didn't help me: https://discuss.elastic.co/t/how-to-match-all-item-in-nested-array/163873 and "ElasticSearch: How to query exact nested array".

First off, if you filter by id: abc, you obviously won't be able to get id: def back.

Second, because nested fields are indexed as separate subdocuments, a nested query cannot by itself require that all trips have changes equal to 0. The connection between the individual trips is lost and they "don't know about each other", so a parent document matches as soon as any one of its trips matches.
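To make the "any" versus "all" distinction concrete, here is a rough plain-Python model of the two sample documents. It does not talk to Elasticsearch; the helper names nested_term_match and all_trips_match are made up for illustration, and the model only approximates how a nested query selects parent documents.

```python
# Rough model only: each trip stands in for one hidden nested sub-document.
docs = {
    "abc": [{"type": "home", "changes": 0}, {"type": "home", "changes": 1}],
    "def": [{"type": "home", "changes": 0}, {"type": "home", "changes": 0}],
}


def nested_term_match(trips, field, value):
    """A parent is selected as soon as at least one trip matches."""
    return any(trip.get(field) == value for trip in trips)


def all_trips_match(trips, field, value):
    """What the question actually asks for: every trip must match."""
    return all(trip.get(field) == value for trip in trips)


nested_hits = [doc_id for doc_id, trips in docs.items()
               if nested_term_match(trips, "changes", 0)]
wanted_hits = [doc_id for doc_id, trips in docs.items()
               if all_trips_match(trips, "changes", 0)]

print(nested_hits)  # ['abc', 'def']
print(wanted_hits)  # ['def']
```

Under this model, the nested term query alone keeps "abc" because one of its trips has changes = 0, which is why an extra step is needed.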

What you can do is return only the trips that matched your nested query using inner_hits:

GET trips_solutions/_search
{
  "_source": false,
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "inner_hits": {},
            "path": "trips",
            "query": {
              "term": {
                "trips.changes": {
                  "value": 0
                }
              }
            }
          }
        }
      ]
    }
  }
}

The easiest solution then is to dynamically save this nested info on the parent object, as discussed here, and run a range/term query on the resulting array.


EDIT:

Here's how you do it using copy_to to copy the values onto the doc's top level:

PUT trips_solutions
{
  "mappings": {
    "properties": {
      "trips_changes": {
        "type": "integer"
      },
      "trips": {
        "type": "nested",
        "properties": {
          "changes": {
            "type": "integer",
            "copy_to": "trips_changes"
          }
        }
      }
    }
  }
}

trips_changes will be an array of numbers. I presume they're integers, but more numeric types are available.
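As an illustration of what ends up in trips_changes, the following plain-Python sketch collects the values from the two documents in the question. The function copy_changes_to_top_level is hypothetical and only mimics the effect loosely; in Elasticsearch, copy_to adds the values to the index without modifying _source. Sorting the values here is an assumption about how a script would see them.

```python
def copy_changes_to_top_level(source):
    """Collect every trips[].changes value, loosely mimicking copy_to.

    Hypothetical helper: the real copying happens at index time inside
    Elasticsearch and leaves the stored _source untouched.
    """
    trips = source.get("trips", [])
    # Sorted on the assumption that numeric doc values are exposed in order.
    return sorted(trip["changes"] for trip in trips if "changes" in trip)


first = {"trips": [{"type": "home", "changes": 0},
                   {"type": "home", "changes": 1}]}
second = {"trips": [{"type": "home", "changes": 0},
                    {"type": "home", "changes": 0}]}

print(copy_changes_to_top_level(first))   # [0, 1]
print(copy_changes_to_top_level(second))  # [0, 0]
```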

Then index a few docs:

POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":1}]}

POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":0}]}

And finally run the query:

GET trips_solutions/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "trips",
            "query": {
              "term": {
                "trips.changes": {
                  "value": 0
                }
              }
            }
          }
        },
        {
          "script": {
            "script": {
              "source": "doc.trips_changes.stream().filter(val -> val != 0).count() == 0"
            }
          }
        }
      ]
    }
  }
}

Note that we first filter normally using the nested term query to narrow down our search context (scripts are slow, so this is useful). We then check whether the accumulated top-level trips_changes contains any non-zero values and reject the documents that do.
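The two-stage logic of that query can be sketched in plain Python. This is a simplified model, not code that runs inside Elasticsearch: the function matches_query is hypothetical, and it reads the values straight from the trips list instead of from the indexed trips_changes field.

```python
def matches_query(source):
    """Model of the bool/must query: nested term filter, then script filter."""
    changes = [trip["changes"] for trip in source.get("trips", [])]

    # Stage 1 (nested term query): at least one trip must have changes == 0.
    if not any(value == 0 for value in changes):
        return False

    # Stage 2 (script): mirrors
    # doc.trips_changes.stream().filter(val -> val != 0).count() == 0
    non_zero_count = sum(1 for value in changes if value != 0)
    return non_zero_count == 0


mixed = {"trips": [{"type": "home", "changes": 0},
                   {"type": "home", "changes": 1}]}
all_zero = {"trips": [{"type": "home", "changes": 0},
                      {"type": "home", "changes": 0}]}
no_trips = {"trips": []}

print(matches_query(mixed))     # False
print(matches_query(all_zero))  # True
print(matches_query(no_trips))  # False
```

In this model the first stage also excludes documents with no trips at all, since they contain no trip with changes = 0.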
