
Concatenating fields in an OpenSearch / ElasticSearch aggregation

I have an OpenSearch index with the following mapping (simplified):

PUT /house
{
  "mappings": {
    "properties": {
      "house": { "type": "keyword" },
      "people": {
        "type": "nested",
        "properties": {
          "forename": { "type": "keyword" },
          "surname": { "type": "keyword" }
        }
      }
    }
  }
}

I'd like to retrieve an aggregation whose bucket key is "[forename] [surname]".

Toy data:

PUT /house/_doc/1
{
  "house": "house1",
  "people": [
    { "forename": "Dave", "surname": "Daveson" },
    { "forename": "Jeff", "surname": "Jeffson" }
  ]
}

PUT /house/_doc/2
{
  "house": "house1",
  "people": [
    { "forename": "Dave", "surname": "Daveson" },
    { "forename": "Jeffs", "surname": "Jeffsons" }
  ]
}

The following doesn't return what I'd expect, and I can't figure out which object paths to put in the script to get it to work:

GET house/_search
{
  "aggs": {
    "people": {
      "nested": {
        "path": "people"
      },
      "aggs": {
        "people.name": {
          "terms": {
            "script": "[params._source['forename'], params._source['surname']].join(' ')"
          }
        }
      }
    }
  },
  "size": 0
}

Returns:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "people" : {
      "doc_count" : 4,
      "people.name" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "null null",
            "doc_count" : 4
          }
        ]
      }
    }
  }
}
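
One plausible reading of the "null null" key, sketched in Python rather than Painless. It assumes that params._source exposes the root document's source, where forename and surname are not top-level keys but sit inside the people array. The helper name script_key is hypothetical and only mimics the lookup and the join:

```python
# Toy model of the failing script.
# Assumption: params._source is the ROOT document's source, not one nested object.
root_source = {
    "house": "house1",
    "people": [
        {"forename": "Dave", "surname": "Daveson"},
        {"forename": "Jeff", "surname": "Jeffson"},
    ],
}


def script_key(source):
    """Mimic [source['forename'], source['surname']].join(' ') on a map."""
    parts = [source.get("forename"), source.get("surname")]
    # A missing key yields None here; render it as "null" to match the output above.
    return " ".join("null" if p is None else p for p in parts)


# Lookup at the top level finds neither key.
print(script_key(root_source))
# The same lookup on a single nested object finds both.
print(script_key(root_source["people"][0]))
```

Under that assumption, every one of the four nested objects produces the same key, which matches the single bucket with a doc_count of 4.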

Without a script I can aggregate correctly on forename, surname, or both, but when using both I can't reliably "join" the results, since the buckets can be sorted only by doc_count or key:

GET house/_search
{
  "aggs": {
    "people": {
      "nested": {
        "path": "people"
      },
      "aggs": {
        "people.forename": {
          "terms": { "field": "people.forename" }
        },
        "people.surname": {
          "terms": { "field": "people.surname" }
        }
      }
    }
  },
  "size": 0
}

Returns:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "people" : {
      "doc_count" : 4,
      "people.surname" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "Daveson",
            "doc_count" : 2
          },
          {
            "key" : "Jeffson",
            "doc_count" : 1
          },
          {
            "key" : "Jeffsons",
            "doc_count" : 1
          }
        ]
      },
      "people.forename" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "Dave",
            "doc_count" : 2
          },
          {
            "key" : "Jeff",
            "doc_count" : 1
          },
          {
            "key" : "Jeffs",
            "doc_count" : 1
          }
        ]
      }
    }
  }
}
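
To illustrate why the two independent bucket lists cannot be joined afterwards, here is a small Python sketch that recounts the toy data client-side. It does not talk to OpenSearch; the variable names are made up for the illustration:

```python
from collections import Counter

# (forename, surname) for each nested object in the toy data.
people = [
    ("Dave", "Daveson"), ("Jeff", "Jeffson"),    # document 1
    ("Dave", "Daveson"), ("Jeffs", "Jeffsons"),  # document 2
]

# What the two separate terms aggregations report.
forename_counts = Counter(f for f, _ in people)
surname_counts = Counter(s for _, s in people)

# What is actually wanted: one bucket per full name.
pair_counts = Counter(f"{f} {s}" for f, s in people)

# Try to re-join on doc_count alone: for each forename, list every
# surname that has the same count. More than one candidate means
# the pairing cannot be recovered from the buckets.
candidates = {
    f: sorted(s for s, n in surname_counts.items() if n == c)
    for f, c in forename_counts.items()
}

print(candidates)
print(dict(pair_counts))
```

Both "Jeff" and "Jeffs" end up with two candidate surnames, because the per-field buckets keep the counts but drop which forename occurred with which surname.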

To get those results, read the fields through doc values in the script, using their full nested paths:

GET house/_search
{
  "aggs": {
    "people": {
      "nested": {
        "path": "people"
      },
      "aggs": {
        "people.name": {
          "terms": {
            "script": "doc['people.forename'].value + ' ' +  doc['people.surname'].value"
          }
        }
      }
    }
  },
  "size": 0
}

Results:

"aggregations" : {
    "people" : {
      "doc_count" : 4,
      "people.name" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "Dave Daveson",
            "doc_count" : 2
          },
          {
            "key" : "Jeff Jeffson",
            "doc_count" : 1
          },
          {
            "key" : "Jeffs Jeffsons",
            "doc_count" : 1
          }
        ]
      }
    }
  }
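
If some nested objects might lack a forename or a surname, a guard on the doc values could be added to the script. The following Python sketch only builds such a request body and prints it as JSON; it does not send anything. The empty-string fallback, the trim, and the name build_body are assumptions for illustration, not part of the answer above:

```python
import json

# Painless source with a size() check before reading .value.
# Assumption: a missing field should contribute an empty string.
GUARDED_SCRIPT = (
    "String f = doc['people.forename'].size() == 0 ? '' : doc['people.forename'].value; "
    "String s = doc['people.surname'].size() == 0 ? '' : doc['people.surname'].value; "
    "return (f + ' ' + s).trim();"
)


def build_body(script_source):
    """Return the same nested terms aggregation, with the given script source."""
    return {
        "size": 0,
        "aggs": {
            "people": {
                "nested": {"path": "people"},
                "aggs": {
                    "people.name": {
                        "terms": {
                            "script": {"lang": "painless", "source": script_source}
                        }
                    }
                },
            }
        },
    }


body = build_body(GUARDED_SCRIPT)
# Paste the printed JSON as the body of GET house/_search.
print(json.dumps(body, indent=2))
```

With the toy data every object has both fields, so this variant should produce the same three buckets as the simpler script.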


 