How do I aggregate slightly different data in Elasticsearch?
I have a query that computes percentiles of request duration for the endpoint /api/v1/blabla:
POST /filebeat-nginx-*/_search
{
"aggs": {
"hosts": {
"terms": {
"field": "host.name",
"size": 1000
},
"aggs": {
"url": {
"terms": {
"field": "nginx.access.url",
"size": 1000
},
"aggs": {
"time_duration_percentiles": {
"percentiles": {
"field": "nginx.access.time_duration",
"percents": [
50,
90
],
"keyed": true
}
}
}
}
}
}
},
"size": 0,
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"prefix": {
"nginx.access.url": "/api/v1/blabla"
}
}
]
}
},
{
"range": {
"@timestamp": {
"gte": "now-10m",
"lte": "now"
}
}
}
]
}
}
}
The problem is that arguments are also passed to this endpoint, for example /api/v1/blabla?lang=en&type=active or /api/v1/blabla/?lang=en&type=history, and so on. As a result, the response shows percentiles for each such "separate" endpoint:
{
"key" : "/api/v1/blabla?lang=ru",
"doc_count" : 423,
"time_duration_percentiles" : {
"values" : {
"50.0" : 0.21199999749660492,
"90.0" : 0.29839999079704277
}
}
},
{
"key" : "/api/v1/blabla?lang=en&type=active",
"doc_count" : 31,
"time_duration_percentiles" : {
"values" : {
"50.0" : 0.21699999272823334,
"90.0" : 0.2510000020265579
}
}
},
{
"key" : "/api/v1/blabla?lang=en",
"doc_count" : 4,
"time_duration_percentiles" : {
"values" : {
"50.0" : 0.22700000554323196,
"90.0" : 0.24899999797344208
}
}
}
Please tell me: is it possible to somehow aggregate similar endpoints into a single /api/v1/blabla and get overall percentiles?
Like this:
{
"key" : "/api/v1/blabla",
"doc_count" : 4,
"time_duration_percentiles" : {
"values" : {
"50.0" : 0.22700000554323196,
"90.0" : 0.24899999797344208
}
}
}
You can try splitting nginx.access.url in a script,
but keep in mind that it may be slow:
{
"aggs": {
"hosts": {
"terms": {
"field": "host.name",
"size": 1000
},
"aggs": {
"url": {
"terms": {
"script": {
"source": "/\\?/.split(doc['nginx.access.url'].value)[0]" <--- here
},
"size": 1000
},
"aggs": {
"time_duration_percentiles": {
"percentiles": {
"field": "nginx.access.time_duration",
"percents": [
50,
90
],
"keyed": true
}
}
}
}
}
}
},
...
}
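Note that the regex literal in the Painless script requires `script.painless.regex.enabled: true` in elasticsearch.yml. For reference, here is a minimal Python sketch of the same key normalization, with an optional extra step (an assumption on my part) that also folds the trailing-slash variant /api/v1/blabla/ into the same bucket:

```python
# Sketch of the normalization the Painless script performs on each URL,
# i.e. /\?/.split(doc['nginx.access.url'].value)[0], plus an optional
# trailing-slash fold so /api/v1/blabla/ and /api/v1/blabla share a bucket.
def normalize_url(url: str) -> str:
    path = url.split("?", 1)[0]      # drop the query string
    return path.rstrip("/") or "/"   # fold the trailing-slash variant

urls = [
    "/api/v1/blabla?lang=ru",
    "/api/v1/blabla?lang=en&type=active",
    "/api/v1/blabla/?lang=en",
]
print({normalize_url(u) for u in urls})  # -> {'/api/v1/blabla'}
```

All three URLs collapse into one terms-aggregation key, so their documents feed a single percentiles bucket.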
By the way, it is good practice to extract the URI hostname, path, query string, etc. before indexing your documents. You can do this with the URI parts ingest processor, among other mechanisms.
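A minimal sketch of such an ingest pipeline (the pipeline name here is hypothetical; the uri_parts processor is available in Elasticsearch 7.12+):

```json
PUT _ingest/pipeline/nginx-url-parts
{
  "description": "Split nginx.access.url into structured url.* fields",
  "processors": [
    {
      "uri_parts": {
        "field": "nginx.access.url",
        "target_field": "url",
        "keep_original": true
      }
    }
  ]
}
```

Documents indexed through this pipeline get fields such as url.path and url.query, so you can aggregate on url.path directly instead of running a script at query time.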