How to aggregate over dynamic fields in Elasticsearch?

I am trying to aggregate over dynamic fields (which differ from document to document) in Elasticsearch. The documents look like this:

[{
   "name": "galaxy note",
   "price": 123,
   "attributes": {
      "type": "phone",
      "weight": "140gm"
   }
},{
   "name": "shirt",
   "price": 123,
   "attributes": {
      "type": "clothing",
      "size": "m"
   }
}]

As you can see, the attributes change across documents. What I'm trying to achieve is to aggregate over the fields of these attributes, like so:

{
     aggregations: {
         types: {
             buckets: [{key: 'phone', count: 123}, {key: 'clothing', count: 12}]
         }
     }
}

I am trying to use Elasticsearch's aggregation feature to achieve this, but I cannot find the correct way. Is it possible to do this via aggregations? Or should I start looking into facets, even though they seem to be deprecated?

You have to define attributes as nested in your mapping and change the layout of the individual attribute values to the fixed layout { key: DynamicKey, value: DynamicValue }:

PUT /catalog
{
  "settings" : {
    "number_of_shards" : 1
  },
  "mappings" : {
    "article": {
      "properties": {
        "name": { 
          "type" : "string", 
          "index" : "not_analyzed" 
        },
        "price": { 
          "type" : "integer" 
        },
        "attributes": {
          "type": "nested",
          "properties": {
            "key": {
              "type": "string",
              "index": "not_analyzed"
            },
            "value": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }  
    }
  }
}

You can then index your articles like this:

POST /catalog/article
{
  "name": "shirt",
  "price": 123,
  "attributes": [
    { "key": "type", "value": "clothing"},
    { "key": "size", "value": "m"}
  ]
}

POST /catalog/article
{
  "name": "galaxy note",
  "price": 123,
  "attributes": [
    { "key": "type", "value": "phone"},
    { "key": "weight", "value": "140gm"}
  ]
}
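
If your existing documents still use the original object layout from the question, the conversion to the key/value list could be sketched roughly as follows. This is only an illustration: the helper name to_key_value_attributes is hypothetical, and casting every value to a string is an assumption made to match the string mapping above.

```python
def to_key_value_attributes(doc):
    """Return a copy of doc whose "attributes" object becomes a list of
    {"key": ..., "value": ...} pairs (hypothetical helper)."""
    converted = dict(doc)  # shallow copy, the input document is not modified
    attributes = doc.get("attributes") or {}
    converted["attributes"] = [
        # Assumption: values are stored as strings, as in the mapping above.
        {"key": key, "value": str(value)}
        for key, value in attributes.items()
    ]
    return converted


shirt = {
    "name": "shirt",
    "price": 123,
    "attributes": {"type": "clothing", "size": "m"},
}
print(to_key_value_attributes(shirt))
```

The result has the same shape as the request bodies shown above and can be sent to the index with whatever client you already use.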

Finally, you can aggregate over the nested attributes:

GET /catalog/_search
{
  "query":{
    "match_all":{}
  },     
  "aggs": {
    "attributes": {
      "nested": {
        "path": "attributes"
      },
      "aggs": {
        "key": {
          "terms": {
            "field": "attributes.key"
          },
          "aggs": {
            "value": {
              "terms": {
                "field": "attributes.value"
              }
            }
          }
        }
      }
    }
  }
}

This gives you the information you requested, in a slightly different form:

[...]
"buckets": [
  {
    "key": "type",
    "doc_count": 2,
    "value": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
      {
        "key": "clothing",
        "doc_count": 1
      }, {
        "key": "phone",
        "doc_count": 1
      }
      ]
    }
  },
[...]
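
If you want something closer to the shape asked for in the question, the nested buckets can be flattened on the client side. The following is a minimal sketch, assuming the response has already been parsed into a Python dict and that the buckets sit under aggregations.attributes.key, which follows from the aggregation names used in the request above. The function name flatten_attribute_buckets and the trimmed sample response are illustrative only.

```python
def flatten_attribute_buckets(response):
    """Map each attribute key to a {value: doc_count} dict (hypothetical helper)."""
    key_buckets = response["aggregations"]["attributes"]["key"]["buckets"]
    return {
        key_bucket["key"]: {
            value_bucket["key"]: value_bucket["doc_count"]
            for value_bucket in key_bucket["value"]["buckets"]
        }
        for key_bucket in key_buckets
    }


# Sample response trimmed to the part shown above; a real response
# contains more fields and more buckets.
sample_response = {
    "aggregations": {
        "attributes": {
            "doc_count": 4,
            "key": {
                "buckets": [
                    {
                        "key": "type",
                        "doc_count": 2,
                        "value": {
                            "doc_count_error_upper_bound": 0,
                            "sum_other_doc_count": 0,
                            "buckets": [
                                {"key": "clothing", "doc_count": 1},
                                {"key": "phone", "doc_count": 1},
                            ],
                        },
                    }
                ]
            },
        }
    }
}

counts = flatten_attribute_buckets(sample_response)
print(counts["type"])  # {'clothing': 1, 'phone': 1}
```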

Not sure if this is what you mean, but this is fairly simple with basic aggregation functionality. Beware that I did not include a mapping, so a type made up of multiple words will be split into separate terms and you will get one bucket per word.

POST /product/atype
{
   "name": "galaxy note",
   "price": 123,
   "attributes": {
      "type": "phone",
      "weight": "140gm"
   }
}

POST /product/atype
{
   "name": "shirt",
   "price": 123,
   "attributes": {
      "type": "clothing",
      "size": "m"
   }
}

GET /product/_search?search_type=count
{
  "aggs": {
    "byType": {
      "terms": {
        "field": "attributes.type",
        "size": 10
      }
    }
  }
}
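
Note that a terms aggregation like this needs a concrete field name, so this approach only works for attributes you know in advance. If you have a list of such names, the request body could be generated rather than written by hand. A rough sketch, where build_terms_aggs and the by_ prefix for aggregation names are hypothetical choices, not part of any Elasticsearch API:

```python
import json


def build_terms_aggs(attribute_names, size=10):
    """Build a search body with one terms aggregation per known attribute
    name (hypothetical helper)."""
    return {
        "aggs": {
            "by_" + name: {
                "terms": {"field": "attributes." + name, "size": size}
            }
            for name in attribute_names
        }
    }


# Prints a body equivalent to the request above, extended to two attributes.
print(json.dumps(build_terms_aggs(["type", "size"]), indent=2))
```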
