
Comparing different document field values in an Elasticsearch index

I would like to know how to compare values from two different documents. I am ingesting the following documents into the index.

For each plugin_instance, I want to compare the value of the document whose "type_instance" is "allocated-mb" with the value of the document whose "type_instance" is "max-mb", and find the cases where allocated-mb > max-mb.

POST test/_bulk
{"index":{"_id":1}}
{"plugin_instance": "root-yarn-queue-name-1", "type_instance": "allocated-mb", "value": 4024}
{"index":{"_id":2}}
{"plugin_instance": "root-yarn-queue-name-1", "type_instance": "max-mb", "value": 2048}
{"index":{"_id":3}}
{"plugin_instance": "root-yarn-queue-name-2", "type_instance": "max-mb", "value": 3048}
{"index":{"_id":4}}
{"plugin_instance": "root-yarn-queue-name-2", "type_instance": "allocated-mb", "value": 1028}
{"index":{"_id":5}}
{"plugin_instance": "some-random-queue-name-2", "type_instance": "allocated-mb", "value": 2028}
{"index":{"_id":6}}
{"plugin_instance": "some-random-queue-name-2", "type_instance": "max-mb", "value": 2028}

What would be an easy way to achieve the following?

  1. Select records with plugin_instance=root-yarn-queue-name-*
  2. Select records with type_instance in (allocated-mb, allocated-vcores, max-mb, max-vcores)
  3. Group records with the same plugin_instance (so that each bucket holds the records for one queue)
  4. Check whether allocated-mb > max-mb and allocated-vcore > max-vcore, and select the records that fulfil all of these conditions within a given time period

So far I have managed to bucket the documents by plugin_instance. What would be an easy way to compare the documents within each bucket based on "type_instance"?

GET test/_search
{
  "query": {
    "query_string": {
      "fields": [
        "plugin_instance.keyword",
        "type_instance.keyword"
      ],
      "query": "root-yarn-queue-name-* AND (max-mb OR allocated-mb)"
    }
  },
  "aggs": {
    "byField": {
      "terms": {
        "field": "plugin_instance.keyword"
      }
    }
  }
}
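
Steps 1 and 2 of the list can also be written as explicit filters instead of a query_string. The following is only a sketch, built as a Python dict so it can be printed as a request body: the helper name build_filter_query is made up for illustration, the field names assume the default text + keyword mapping, and the time-period restriction is left out because no timestamp field appears in the sample documents.

```python
import json

# Type instances from step 2 of the list; adjust to the values actually indexed.
TYPE_INSTANCES = ["allocated-mb", "allocated-vcores", "max-mb", "max-vcores"]


def build_filter_query(queue_pattern="root-yarn-queue-name-*",
                       type_instances=TYPE_INSTANCES):
    """Build a _search body: filter by queue and metric, then bucket per queue.

    Hypothetical helper; field names assume a `.keyword` sub-field exists.
    """
    return {
        "size": 0,  # only the aggregation is of interest, not the hits
        "query": {
            "bool": {
                "filter": [
                    # Step 1: queues whose name matches the wildcard pattern.
                    {"wildcard": {"plugin_instance.keyword": {"value": queue_pattern}}},
                    # Step 2: only the metrics we want to compare.
                    {"terms": {"type_instance.keyword": list(type_instances)}},
                ]
            }
        },
        # Step 3: one bucket per queue.
        "aggs": {"byField": {"terms": {"field": "plugin_instance.keyword"}}},
    }


print(json.dumps(build_filter_query(), indent=2))
```

The printed JSON can be sent as the body of GET test/_search. Filters in a bool query do not contribute to scoring, which is fine here since only the buckets matter.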

One way to solve this is to fetch the documents and perform the filtering on the client side. The other is to use a bucket_selector aggregation.
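
For the client-side option, a minimal sketch in plain Python could look like the following. It works on dicts shaped like the _source objects from the question; how the hits are fetched from Elasticsearch is left out, and the function name over_allocated is made up for illustration.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Sample documents copied from the bulk request in the question.
docs = [
    {"plugin_instance": "root-yarn-queue-name-1", "type_instance": "allocated-mb", "value": 4024},
    {"plugin_instance": "root-yarn-queue-name-1", "type_instance": "max-mb", "value": 2048},
    {"plugin_instance": "root-yarn-queue-name-2", "type_instance": "max-mb", "value": 3048},
    {"plugin_instance": "root-yarn-queue-name-2", "type_instance": "allocated-mb", "value": 1028},
    {"plugin_instance": "some-random-queue-name-2", "type_instance": "allocated-mb", "value": 2028},
    {"plugin_instance": "some-random-queue-name-2", "type_instance": "max-mb", "value": 2028},
]


def over_allocated(docs, pattern="root-yarn-queue-name-*",
                   allocated="allocated-mb", limit="max-mb"):
    """Return the queues whose `allocated` value exceeds their `limit` value."""
    # Group values per queue: {plugin_instance: {type_instance: value}}.
    # If a metric occurs more than once for a queue, the last document wins.
    per_queue = defaultdict(dict)
    for doc in docs:
        if fnmatchcase(doc["plugin_instance"], pattern):
            per_queue[doc["plugin_instance"]][doc["type_instance"]] = doc["value"]

    # Keep only queues that report both metrics and where allocated > limit.
    return sorted(
        queue
        for queue, metrics in per_queue.items()
        if allocated in metrics and limit in metrics
        and metrics[allocated] > metrics[limit]
    )


print(over_allocated(docs))  # ['root-yarn-queue-name-1']
```

With the sample data only root-yarn-queue-name-1 qualifies (4024 > 2048). This keeps the logic easy to read, but every matching document has to be transferred to the client, so the aggregation route is likely the better fit for large indices.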

Mapping:

{
  "index60" : {
    "mappings" : {
      "properties" : {
        "plugin_instance" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "type_instance" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "value" : {
          "type" : "long"
        }
      }
    }
  }
}

Data:

  "hits" : [
      {
        "_index" : "index60",
        "_type" : "_doc",
        "_id" : "vCN--3AB_Wo5Rvhl5c7f",
        "_score" : 1.0,
        "_source" : {
          "plugin_instance" : "root-yarn-queue-name-1",
          "type_instance" : "max-vcore",
          "value" : 1000
        }
      },
      {
        "_index" : "index60",
        "_type" : "_doc",
        "_id" : "vSN--3AB_Wo5Rvhl9M6R",
        "_score" : 1.0,
        "_source" : {
          "plugin_instance" : "root-yarn-queue-name-1",
          "type_instance" : "max-mb",
          "value" : 2048
        }
      },
      {
        "_index" : "index60",
        "_type" : "_doc",
        "_id" : "viN--3AB_Wo5Rvhl_s5m",
        "_score" : 1.0,
        "_source" : {
          "plugin_instance" : "root-yarn-queue-name-2",
          "type_instance" : "max-mb",
          "value" : 3048
        }
      },
      {
        "_index" : "index60",
        "_type" : "_doc",
        "_id" : "wCN_-3AB_Wo5Rvhlhc53",
        "_score" : 1.0,
        "_source" : {
          "plugin_instance" : "root-yarn-queue-name-2",
          "type_instance" : "allocated-mb",
          "value" : 1028
        }
      },
      {
        "_index" : "index60",
        "_type" : "_doc",
        "_id" : "wSOA-3AB_Wo5RvhlCc5r",
        "_score" : 1.0,
        "_source" : {
          "plugin_instance" : "root-yarn-queue-name-1",
          "type_instance" : "allocated-mb",
          "value" : 3000
        }
      }
    ]

Query:

  1. Create a bucket for each plugin_instance matching the pattern root-yarn-queue-name-.*
  2. Create sub-buckets for ["max-mb","allocated-mb"]
  3. Find the maximum value of ["max-mb","allocated-mb"]
  4. Find the value of allocated-mb
  5. Select the buckets where allocated-mb holds the maximum value

GET index60/_search
{
  "size": 0,
  "aggs": {
    "plugin_instance": {
      "terms": {
        "field": "plugin_instance.keyword",
        "include": "root-yarn-queue-name-.*",
        "size": 10000
      },
      "aggs": {
        "instance": {
          "terms": {
            "field": "type_instance.keyword",
            "include": [
              "max-mb",
              "allocated-mb"
            ],
            "size": 10
          }
        },
        "maxValue": {
          "max": {
            "field": "value"
          }
        },
        "allocated-mb": {
          "filter": {
            "term": {
              "type_instance.keyword": "allocated-mb"
            }
          },
          "aggs": {
            "filtered_maxValue": {
              "max": {
                "field": "value"
              }
            }
          }
        },
        "my_bucket": {
          "bucket_selector": {
            "buckets_path": {
              "filteredValue": "allocated-mb>filtered_maxValue",
              "maxValue": "maxValue"
            },
            "script": "params.filteredValue==params.maxValue"
          }
        }
      }
    }
  }
}

Result:

 "aggregations" : {
    "plugin_instance" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "root-yarn-queue-name-1",
          "doc_count" : 3,
          "instance" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "allocated-mb",
                "doc_count" : 1
              },
              {
                "key" : "max-mb",
                "doc_count" : 1
              }
            ]
          },
          "maxValue" : {
            "value" : 3000.0
          },
          "allocated-mb" : {
            "doc_count" : 1,
            "filtered_maxValue" : {
              "value" : 3000.0
            }
          }
        }
      ]
    }
  }
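
One caveat: as far as I can tell, the query above keeps a bucket when allocated-mb equals the largest value among all documents in that bucket, which is close to, but not exactly, allocated-mb > max-mb (equal values pass, and an unrelated metric such as max-vcore with a larger value would hide the queue). A more direct variant gives each metric its own filter aggregation and compares the two in the bucket_selector script. The sketch below builds that request body as a Python dict; the helper name build_query and the aggregation names allocated, limit and over_limit are my own choices, and the field names are taken from the mapping above.

```python
import json


def build_query(pattern="root-yarn-queue-name-.*",
                allocated="allocated-mb", limit="max-mb"):
    """Build a _search body that keeps queues where `allocated` > `limit`."""

    def metric(type_instance):
        # Narrow the queue bucket to one type_instance, then take its max value.
        return {
            "filter": {"term": {"type_instance.keyword": type_instance}},
            "aggs": {"max_value": {"max": {"field": "value"}}},
        }

    return {
        "size": 0,
        "aggs": {
            "plugin_instance": {
                # One bucket per queue whose name matches the regex pattern.
                "terms": {
                    "field": "plugin_instance.keyword",
                    "include": pattern,
                    "size": 10000,
                },
                "aggs": {
                    "allocated": metric(allocated),
                    "limit": metric(limit),
                    # Drop every queue bucket where the comparison is false.
                    "over_limit": {
                        "bucket_selector": {
                            "buckets_path": {
                                "allocated": "allocated>max_value",
                                "limit": "limit>max_value",
                            },
                            "script": "params.allocated > params.limit",
                        }
                    },
                },
            }
        },
    }


print(json.dumps(build_query(), indent=2))
```

The printed JSON can be used as the body of GET index60/_search. To cover the vcore check as well, two more filter aggregations could be added in the same way and joined with && in the script. Queues that lack one of the metrics produce a null max, and how those buckets are treated depends on the gap_policy of the bucket_selector.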

Let me know if anything here is unclear.
