
ElasticSearch aggregation - get the exact time of a max histogram value in a timeseries

I'm new to Elasticsearch, so apologies if this is a trivial question.

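I have a time series that is updated irregularly, every n seconds or so, and I want to plot its history. Each document contains a long field named "score" and a long field named "time" that holds the epoch timestamp (in milliseconds) of that score.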

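To reduce the number of points in long-range plots (for example, a whole year), I would like to aggregate the data into 256 buckets and use the maximum "score" value of each bucket. However, I need to keep the original timestamp of each maximum score, not the start of the bucket.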
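To make the goal concrete, here is a minimal client-side sketch of the desired result, written in plain Java with no Elasticsearch involved. The class name `MaxPerBucket`, the method `downsample` and the array-based input are assumptions made for illustration only; the point is just that each bucket yields the (time, score) of its highest-scoring sample rather than the bucket start.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the original post.
final class MaxPerBucket {

  /**
   * Splits [from, to] into equal-width buckets and returns, for every
   * non-empty bucket, a {time, score} pair for the sample with the highest
   * score. Results are ordered by bucket, and therefore by time range.
   */
  static List<long[]> downsample(long[] times, long[] scores,
                                 long from, long to, int buckets) {
    long interval = Math.max(1, (to - from) / buckets);
    Map<Long, long[]> best = new TreeMap<>();
    for (int i = 0; i < times.length; i++) {
      if (times[i] < from || times[i] > to) {
        continue; // outside the requested range
      }
      // Clamp so that a sample at exactly "to" lands in the last bucket.
      long bucket = Math.min(buckets - 1, (times[i] - from) / interval);
      long[] current = best.get(bucket);
      if (current == null || scores[i] > current[1]) {
        // Keep the sample's own timestamp, not the bucket start.
        best.put(bucket, new long[] {times[i], scores[i]});
      }
    }
    return new ArrayList<>(best.values());
  }
}
```

This only works once all documents have been pulled to the client, which is exactly what the aggregation is meant to avoid; it is shown here purely as a reference for the expected output.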

I managed to get the buckets by running the following query:

 curl -XGET 'http://localhost:9200/localhost.localdomain/SET_APPS/_search' -d'
 {
   "query": {
     "range": {
       "time": {
         "from": 1429010378445,
         "to": 1431602378445,
         "include_lower": true,
         "include_upper": true
       }
     }
   },
   "aggregations": {
     "time_hist": {
       "histogram": {
         "field": "time",
         "interval": 10125000,
         "order": { "_count": "asc" },
         "min_doc_count": 0,
         "extended_bounds": { "min": 1429010378445, "max": 1431602378445 }
       },
       "aggregations": {
         "max_score": { "max": { "field": "score" } }
       }
     }
   }
 }'

However, I only get the timestamp of each bucket, whereas I need the original time of the score:

 {
   "took": 8,
   "timed_out": false,
   "_shards": {
     "total": 5,
     "successful": 4,
     "failed": 1,
     "failures": [{
       "index": "localhost.localdomain",
       "shard": 2,
       "status": 500,
       "reason": "QueryPhaseExecutionException[[localhost.localdomain][2]: query[filtered(time:[1429010378445 TO 1431602378445])->cache(_type:SET_APPS)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: IllegalStateException[unexpected docvalues type NONE for field 'score' (expected one of [SORTED_NUMERIC, NUMERIC]). Use UninvertingReader or index with docvalues.]; "
     }]
   },
   "hits": {
     "total": 2018,
     "max_score": 1.0,
     "hits": [{
       "_index": "localhost.localdomain",
       "_type": "SET_APPS",
       "_id": "AU09dUBR80Hb_Fungv_r",
       "_score": 1.0,
       "_source": { "time": 1431255203918, "score": 6027 }
     }, {
       "_index": "localhost.localdomain",
       "_type": "SET_APPS",
       "_id": "AU09c7MS80Hb_Fungv_X",
       "_score": 1.0,
       "_source": { "time": 1431255102221, "score": 5518 }
     }
     ....
     ]
   },
   "aggregations": {
     "time_hist": {
       "buckets": [{
         "key": 1429002000000,
         "doc_count": 0,
         "max_score": { "value": null }
       },
       ......
       {
         "key": 1431249750000,
         "doc_count": 215,
         "max_score": { "value": 8564.0, "value_as_string": "8564.0" }
       }, {
         "key": 1431280125000,
         "doc_count": 228,
         "max_score": { "value": 18602.0, "value_as_string": "18602.0" }
       }, {
         "key": 1431259875000,
         "doc_count": 658,
         "max_score": { "value": 17996.0, "value_as_string": "17996.0" }
       }, {
         "key": 1431270000000,
         "doc_count": 917,
         "max_score": { "value": 17995.0, "value_as_string": "17995.0" }
       }]
     }
   }
 }

In the result above, if we query specifically for the score 18602, we get its real timestamp:

 $ curl -XGET 'http://localhost:9200/localhost.localdomain/SET_APPS/_search' -d'
 {
   "fields": [ "time", "score" ],
   "query": { "term": { "score": "18602" } }
 }'
 {"took":3,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"localhost.localdomain","_type":"SET_APPS","_id":"AU0-90Vsi-vs_2ajcYu-","_score":1.0,"fields":{"score":[18602],"time":[1431280502124]}}]}}
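The gap between the bucket key (1431280125000) and the real timestamp (1431280502124) follows from how a numeric histogram derives its keys: each value is rounded down to a multiple of the interval. The sketch below reproduces that arithmetic in plain Java, assuming the default histogram offset of 0; the class and method names (`HistogramKey`, `bucketKey`, `intervalFor`) are made up for illustration.

```java
// Hypothetical helper illustrating the rounding, not Elasticsearch code.
final class HistogramKey {

  /** Width of each bucket when [from, to] is split into the given number of buckets. */
  static long intervalFor(long from, long to, int buckets) {
    return (to - from) / buckets;
  }

  /** Rounds a value down to the start of its bucket (offset assumed to be 0). */
  static long bucketKey(long value, long interval) {
    return Math.floorDiv(value, interval) * interval;
  }
}
```

With the numbers from the query, `intervalFor(1429010378445L, 1431602378445L, 256)` gives the 10125000 used as "interval", and `bucketKey(1431280502124L, 10125000L)` gives 1431280125000, the key of the bucket whose max score is 18602. The key therefore only says which window the maximum fell into, not when it occurred.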

Any help is appreciated!

I think I found a solution:

 $ curl -XGET 'http://localhost:9200/localhost.localdomain/SET_APPS/_search?pretty=true' -d'
 {
   "size": 0,
   "query": {
     "constant_score": {
       "filter": {
         "range": {
           "time": { "gte": 1457868375000, "lt": 1460460375000 }
         }
       }
     }
   },
   "aggregations": {
     "time_hist": {
       "histogram": {
         "field": "time",
         "interval": 10125000,
         "order": { "_count": "asc" },
         "min_doc_count": 0,
         "extended_bounds": { "min": 1429010378445, "max": 1431602378445 }
       },
       "aggregations": {
         "max_time": {
           "terms": {
             "field": "time",
             "order": { "max_score": "desc" },
             "size": 1
           },
           "aggregations": {
             "max_score": { "max": { "field": "score" } }
           }
         }
       }
     }
   }
 }' > foo

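This seems to produce the desired effect. Within each histogram bucket, the inner terms bucket's "key" is the original timestamp of the document with the highest score: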

 ...
 "aggregations" : {
   "time_hist" : {
     "buckets" : [ {
       "key" : 1429002000000,
       "doc_count" : 0,
       "max_time" : {
         "doc_count_error_upper_bound" : 0,
         "sum_other_doc_count" : 0,
         "buckets" : [ ]
       }
     }, {
       "key" : 1429012125000,
       "doc_count" : 0,
       "max_time" : {
         "doc_count_error_upper_bound" : 0,
         "sum_other_doc_count" : 0,
         "buckets" : [ ]
       }
     },
     ...
     {
       "key" : 1431249750000,
       "doc_count" : 270,
       "max_time" : {
         "doc_count_error_upper_bound" : -1,
         "sum_other_doc_count" : 269,
         "buckets" : [ {
           "key" : 1431255810484,
           "doc_count" : 1,
           "max_score" : { "value" : 8564.0, "value_as_string" : "8564.0" }
         } ]
       }
     }, {
       "key" : 1431280125000,
       "doc_count" : 285,
       "max_time" : {
         "doc_count_error_upper_bound" : -1,
         "sum_other_doc_count" : 284,
         "buckets" : [ {
           "key" : 1431280502124,
           "doc_count" : 1,
           "max_score" : { "value" : 18602.0, "value_as_string" : "18602.0" }
         } ]
       }
     }, {
       "key" : 1431259875000,
       "doc_count" : 821,
       "max_time" : {
         "doc_count_error_upper_bound" : -1,
         "sum_other_doc_count" : 820,
         "buckets" : [ {
           "key" : 1431269132642,
           "doc_count" : 1,
           "max_score" : { "value" : 17996.0, "value_as_string" : "17996.0" }
         } ]
       }
     }, {
       "key" : 1431270000000,
       "doc_count" : 1155,
       "max_time" : {
         "doc_count_error_upper_bound" : -1,
         "sum_other_doc_count" : 1154,
         "buckets" : [ {
           "key" : 1431278681884,
           "doc_count" : 1,
           "max_score" : { "value" : 17995.0, "value_as_string" : "17995.0" }
         } ]
       }
     } ]
   }
 }
 }

Here is the Java code that builds this query and reads the result:

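// Imports assume the Elasticsearch 1.x/2.x Java client and Joda-Time.
// Pair is the poster's own comparable (time, score) type and is not shown here.
import java.io.IOException;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Collections;
import java.util.List;
import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram.Bucket;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram.Order;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.metrics.max.Max;
import org.joda.time.DateTime;

// Returns (time, score) pairs: the max score of each histogram bucket with its original timestamp.
public synchronized List<Pair<Long, Long>>
getScores(Calendar start, Calendar finish, int maxUniqueScoreEntries)
throws IOException
{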
  List<Pair<Long, Long>> retVal = new ArrayList<>(maxUniqueScoreEntries);
  try
  {
    long startTimeMs = start.getTimeInMillis();
    long finishTimeMs = finish.getTimeInMillis();

    Pair<Long, Long> firstVal = new Pair<Long, Long>(start.getTimeInMillis(), 0L);
    retVal.add(firstVal);

    // Histogram over "time". Inside each bucket, a terms aggregation on "time",
    // ordered by its max(score) sub-aggregation and limited to one term, yields
    // the timestamp of the highest-scoring document in that bucket.
    SearchRequestBuilder srb = client.prepareSearch()
      .setIndices(solutionName)
      .setTypes(ThreadMgrWebSocketsSvc.Subprotocols.SET_APPS.toString())
      .setQuery(QueryBuilders.rangeQuery("time").from(startTimeMs).to(finishTimeMs))
      .addAggregation(
        AggregationBuilders.histogram("time_hist").minDocCount(0).field("time").order(Order.COUNT_ASC)
          .extendedBounds(startTimeMs, finishTimeMs)
          .interval((finishTimeMs - startTimeMs) / maxUniqueScoreEntries)
          .subAggregation(
            AggregationBuilders.terms("max_time")
              .field("time")
              .order(Terms.Order.aggregation("max_score", false))
              .size(1)
              .subAggregation(
                AggregationBuilders.max("max_score").field("score"))
          )
      );

      SearchResponse sr = srb.execute().actionGet();

      Histogram timeHist = sr.getAggregations().get("time_hist");
      List<? extends Bucket> timeHistBuckets = timeHist.getBuckets();
      for (int i = 0, len = timeHistBuckets.size(); i < len; i++)
      {
        Long epochTime = null;
        Long maxScore = null;

        Histogram.Bucket maxScoreBucket = timeHistBuckets.get(i);

        Terms maxTimeTermAgg = maxScoreBucket.getAggregations().get("max_time");

        List<Terms.Bucket> buckets = maxTimeTermAgg.getBuckets();

        for (int j = 0, jlen = buckets.size(); j < jlen; j++)
        {
          Terms.Bucket bucket = buckets.get(j);

          epochTime = bucket.getKeyAsNumber().longValue();
          Aggregation agg = bucket.getAggregations().get("max_score");

          if (agg instanceof Max)
          {
            // Only strictly positive maxima are kept; otherwise maxScore stays null
            // and the bucket is skipped below.
            double value = ((Max) agg).getValue();
            if (value > 0)
            {
              maxScore = (long) value;
            }
          }

        }

        if (epochTime != null && maxScore != null)
        {
          System.out.printf(" %d - Date = %s; rawTime = %d ; val = %d\n", i, new DateTime(epochTime).toString(),
          epochTime, maxScore);

          Pair<Long, Long> val = new Pair<>(epochTime, maxScore);
          retVal.add(val);

        }

      }


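    System.out.printf("query was %s, %s \n", new DateTime(startTimeMs).toString(),
      new DateTime(finishTimeMs).toString());

    // Close the series with a zero point at the end of the range. The timestamp is
    // the first element of each pair, so compare getFirst() (assumed to exist on
    // Pair alongside getSecond()) rather than the score.
    Pair<Long, Long> last = retVal.get(retVal.size() - 1);
    if (last.getFirst().longValue() != finishTimeMs)
    {
      Pair<Long, Long> endVal = new Pair<Long, Long>(finishTimeMs, 0L);
      retVal.add(endVal);
    }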
  }
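  catch (Exception e)
  {
    // Fall back to a flat zero line over the requested range, but report the
    // failure instead of swallowing it silently.
    e.printStackTrace();
    retVal.add(new Pair<Long, Long>(start.getTimeInMillis(), 0L));
    retVal.add(new Pair<Long, Long>(finish.getTimeInMillis(), 0L));
  }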

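  // Buckets come back ordered by doc count, so sort the points before returning
  // (this requires Pair to be Comparable).
  Collections.sort(retVal);

  return retVal;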
}    
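
The method above depends on a `Pair` type that is not shown in the post. Since the list is passed to `Collections.sort`, the type has to be comparable. A minimal stand-in might look like the sketch below; this is an assumption about its shape (including the `getFirst()` accessor and the ordering by first, then second element), not the poster's actual class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the Pair type used by getScores().
final class Pair<A extends Comparable<A>, B extends Comparable<B>>
    implements Comparable<Pair<A, B>> {

  private final A first;
  private final B second;

  Pair(A first, B second) {
    this.first = first;
    this.second = second;
  }

  A getFirst() { return first; }

  B getSecond() { return second; }

  // Order by the first element (the timestamp here), then by the second.
  @Override
  public int compareTo(Pair<A, B> other) {
    int byFirst = first.compareTo(other.first);
    return byFirst != 0 ? byFirst : second.compareTo(other.second);
  }
}
```

With this ordering, sorting the result list puts the points back in chronological order, which is what a plotting library would typically expect.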

