Query with filter array field

I want to return documents that include only some of the members of an array field.
For example, I have two order documents:

{   
    "orderNumber":"ORD-111",
    "items":[{"name":"part-1","status":"new"},
             {"name":"part-2","status":"paid"}]
}
{
    "orderNumber":"ORD-112",
    "items":[{"name":"part-3","status":"paid"},
             {"name":"part-4","status":"supplied"}]
}

I want to create a query so that my result includes all the order documents, but only with the items that match {"status":"supplied"}.
The result should look like:

{   
    "orderNumber":"ORD-111",
    "items":[]
}
{
    "orderNumber":"ORD-112",
    "items":[{"name":"part-4","status":"supplied"}]
}

You can use a nested query along with inner_hits to get only the matching array values in the result.

Adding a working example.

Index Mapping:

{
  "mappings": {
    "properties": {
      "items": {
        "type": "nested"
      }
    }
  }
}

Search Query:

{
  "query": {
    "nested": {
      "path": "items",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "items.status": "supplied"
              }
            }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}
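
The query above returns only orders that have at least one matching item, so ORD-111 would be left out rather than returned with an empty items list. One possible way to keep every order, sketched here as a Python helper that builds the request body, is to put the nested query in a bool/should next to a match_all and to exclude the full array from _source. The helper name build_filtered_items_query is hypothetical, and this variant is an assumption added on top of the answer, not something the answer itself proposes; it would need to be checked against your Elasticsearch version.

```python
import json


def build_filtered_items_query(path, field, value):
    """Build a search body that keeps every document and reports matching
    nested objects under inner_hits (hypothetical helper, not an ES API)."""
    return {
        # Drop the full array from _source; matching items come from inner_hits.
        "_source": {"excludes": [path]},
        "query": {
            "bool": {
                "should": [
                    # match_all keeps documents that have no matching nested object.
                    {"match_all": {}},
                    {
                        "nested": {
                            "path": path,
                            "query": {"match": {f"{path}.{field}": value}},
                            "inner_hits": {},
                        }
                    },
                ]
            }
        },
    }


body = build_filtered_items_query("items", "status", "supplied")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. Orders without a supplied item should then come back with an empty inner_hits list instead of being dropped.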

Search Result:

"hits": [
      {
        "_index": "67890614",
        "_type": "_doc",
        "_id": "2",
        "_score": 1.2039728,
        "_source": {
          "orderNumber": "ORD-112",
          "items": [
            {
              "name": "part-3",
              "status": "paid"
            },
            {
              "name": "part-4",
              "status": "supplied"
            }
          ]
        },
        "inner_hits": {
          "items": {
            "hits": {
              "total": {
                "value": 1,
                "relation": "eq"
              },
              "max_score": 1.2039728,
              "hits": [
                {
                  "_index": "67890614",
                  "_type": "_doc",
                  "_id": "2",
                  "_nested": {
                    "field": "items",
                    "offset": 1
                  },
                  "_score": 1.2039728,
                  "_source": {
                    "name": "part-4",
                    "status": "supplied"      // note this
                  }
                }
              ]
            }
          }
        }
      }
    ]
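
Note that _source still contains both items; only inner_hits is limited to the matching one. To get the shape asked for in the question, the response can be post-processed on the client. The following is a minimal sketch that works on a plain dict shaped like the result above; the function name filter_items_from_inner_hits and the sample data are illustrative assumptions, not part of any client library.

```python
def filter_items_from_inner_hits(hits, path="items"):
    """Return one dict per hit whose `path` array holds only the nested
    objects that were reported under inner_hits."""
    results = []
    for hit in hits:
        inner = (
            hit.get("inner_hits", {})
            .get(path, {})
            .get("hits", {})
            .get("hits", [])
        )
        # Restore the original array order using the reported nested offset.
        inner = sorted(inner, key=lambda h: h.get("_nested", {}).get("offset", 0))
        doc = dict(hit.get("_source", {}))
        doc[path] = [h["_source"] for h in inner]
        results.append(doc)
    return results


# Sample hit trimmed down from the search result shown above.
sample_hits = [
    {
        "_id": "2",
        "_source": {
            "orderNumber": "ORD-112",
            "items": [
                {"name": "part-3", "status": "paid"},
                {"name": "part-4", "status": "supplied"},
            ],
        },
        "inner_hits": {
            "items": {
                "hits": {
                    "hits": [
                        {
                            "_nested": {"field": "items", "offset": 1},
                            "_source": {"name": "part-4", "status": "supplied"},
                        }
                    ]
                }
            }
        },
    }
]

filtered = filter_items_from_inner_hits(sample_hits)
print(filtered)
```

By default inner_hits returns only the top three matches per document, so for orders with many matching items the size option inside inner_hits would need to be raised.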

Elasticsearch flattens arrays of objects, so it cannot tell which element of the array actually matched.

As previously answered, you can use nested queries.

How arrays of objects are flattened

Elasticsearch has no concept of inner objects. Therefore, it flattens object hierarchies into a simple list of field names and values. For instance, consider the following document:

PUT my-index-000001/_doc/1
{
  "group" : "fans",
  "user" : [ 
    {
      "first" : "John",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

The user field is dynamically added as a field of type object.

The previous document would be transformed internally into a document that looks more like this:

{
  "group" :        "fans",
  "user.first" : [ "alice", "john" ],
  "user.last" :  [ "smith", "white" ]
}

The user.first and user.last fields are flattened into multi-value fields, and the association between alice and white is lost. This document would incorrectly match a query for alice AND smith:

GET my-index-000001/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "user.first": "Alice" }},
        { "match": { "user.last":  "Smith" }}
      ]
    }
  }
}
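
The effect of flattening can be imitated in a few lines of Python. This is only a simplified model of the behaviour described above, not how Elasticsearch is implemented; the function names flatten_object_array and matches_all are made up for the illustration, and lowercasing stands in for the analyzer.

```python
from collections import defaultdict


def flatten_object_array(doc, field):
    """Turn an array of objects into multi-value fields, losing the
    association between values that came from the same object."""
    flat = {key: value for key, value in doc.items() if key != field}
    values = defaultdict(set)
    for obj in doc[field]:
        for key, value in obj.items():
            values[f"{field}.{key}"].add(value.lower())
    flat.update({key: sorted(vals) for key, vals in values.items()})
    return flat


def matches_all(flat_doc, terms):
    """True if every term is present in its multi-value field."""
    return all(
        value.lower() in flat_doc.get(name, []) for name, value in terms.items()
    )


doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}

flat = flatten_object_array(doc, "user")
print(flat)
# Alice and Smith belong to different users, yet the flattened document matches.
print(matches_all(flat, {"user.first": "Alice", "user.last": "Smith"}))  # True
```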
 

To answer your question:

If you need to index arrays of objects and to maintain the independence of each object in the array, use the nested data type instead of the object data type.

Internally, nested objects index each object in the array as a separate hidden document, meaning that each nested object can be queried independently of the others with the nested query:

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested" 
      }
    }
  }
}

PUT my-index-000001/_doc/1
{
  "group" : "fans",
  "user" : [
    {
      "first" : "John",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

GET my-index-000001/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last":  "Smith" }} 
          ]
        }
      }
    }
  }
}

GET my-index-000001/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last":  "White" }} 
          ]
        }
      },
      "inner_hits": { 
        "highlight": {
          "fields": {
            "user.first": {}
          }
        }
      }
    }
  }
}

The user field is mapped as type nested instead of type object.

The first query doesn't match because Alice and Smith are not in the same nested object.

The second query matches because Alice and White are in the same nested object.

inner_hits allows us to highlight the matching nested documents.
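
The difference can be pictured with a small Python model in which every object in the array is checked on its own, the way a separate hidden document would be. The function nested_match_offsets is a hypothetical name, and case-insensitive equality is a stand-in for a real match query.

```python
def nested_match_offsets(doc, path, terms):
    """Return the array offsets of the objects under `path` that satisfy
    every term on their own."""
    offsets = []
    for offset, obj in enumerate(doc.get(path, [])):
        if all(
            str(obj.get(name, "")).lower() == value.lower()
            for name, value in terms.items()
        ):
            offsets.append(offset)
    return offsets


doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}

# No single user is both Alice and Smith, so nothing matches.
print(nested_match_offsets(doc, "user", {"first": "Alice", "last": "Smith"}))  # []
# The second user (offset 1) is Alice White.
print(nested_match_offsets(doc, "user", {"first": "Alice", "last": "White"}))  # [1]
```

The returned offsets play the same role as the _nested.offset value that inner_hits reports for each matching nested object.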

Interacting with nested documents

Nested documents can be:

- queried with the nested query.
- analyzed with the nested and reverse_nested aggregations.
- sorted with nested sorting.
- retrieved and highlighted with nested inner hits.

Because nested documents are indexed as separate documents, they can only be accessed within the scope of the nested query, the nested/reverse_nested aggregations, or nested inner hits.

Consider performance when taking this approach, as it can be orders of magnitude more expensive.

For more details, you can check the source: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
