Elasticsearch: get the date range of the most recent ingestion

I have an Elasticsearch index that receives new data in large dumps, so looking at the graph it's very obvious when new data is added.

If I only want data from the most recent ingestion (in this case, data from 2020-08-06), what's the best way of doing this?

I can use this query to get the most recent document:

GET /indexname/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": queryString
          }
        }
      ]
    }
  },
  "sort": {
    "@timestamp": "desc"
  },
  "size": 1
}

This returns the most recent document, in this case one with a timestamp of 2020-08-06. I can set that as my endDate and set my startDate to that date minus one day, but I'm worried about cases where the data was ingested overnight and spanned two days.

I could keep making requests that step back in time 5 hours at a time to find the most recent large gap, but I'm worried that making requests in a for loop could be time-consuming. Is there a smarter way to get the date range of my most recent ingestion? Thanks.
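As a rough sketch of how the gap search described above might be done without a request loop: a single `date_histogram` aggregation on `@timestamp` returns document counts per time bucket, and the most recent ingestion can then be located client-side by walking the buckets from newest to oldest. The function names, the one-hour interval, and the five-bucket gap threshold are assumptions chosen for illustration, and `fixed_interval` assumes Elasticsearch 7.2 or later. On a long-lived index the request would probably need a `range` filter to keep the bucket count manageable.

```python
def histogram_query(interval="1h"):
    """Search body: document counts per time bucket, with no hits returned."""
    return {
        "size": 0,
        "aggs": {
            "per_bucket": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": interval}
            }
        },
    }


def latest_ingestion_range(buckets, max_gap_buckets=5):
    """Find the most recent run of activity in date_histogram buckets.

    `buckets` is the list under aggregations.per_bucket.buckets, oldest first.
    Walks newest-first and stops once more than `max_gap_buckets` consecutive
    empty buckets are seen. Returns the keys of the first and last non-empty
    buckets of that run; the last bucket covers its key plus one interval.
    """
    start = end = None
    empty_run = 0
    for bucket in reversed(buckets):
        if bucket["doc_count"] > 0:
            if end is None:
                end = bucket["key_as_string"]
            start = bucket["key_as_string"]
            empty_run = 0
        elif end is not None:
            empty_run += 1
            if empty_run > max_gap_buckets:
                break
    return start, end
```

Because short gaps inside a run are tolerated, an ingestion that starts late on one day and finishes early on the next is reported as one range.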

When your data comes in batches, it's best to assign an identifier to each batch. That way, no date math is required.
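A minimal sketch of what the batch-identifier approach could look like, building plain request bodies in Python. The field name `batch_id`, the identifier format, and the helper names are all assumptions for illustration. The `term` query also assumes `batch_id` is mapped as a `keyword` field.

```python
import uuid
from datetime import datetime, timezone


def new_batch_id(now=None):
    """Build a batch identifier: UTC timestamp plus a short random suffix."""
    now = now or datetime.now(timezone.utc)
    return f"{now:%Y%m%dT%H%M%SZ}-{uuid.uuid4().hex[:8]}"


def tag_batch(docs, batch_id):
    """Return copies of the documents with the (hypothetical) batch_id field set."""
    return [{**doc, "batch_id": batch_id} for doc in docs]


def latest_batch_query():
    """Search body: newest document only, returning just its batch_id."""
    return {"size": 1, "sort": [{"@timestamp": "desc"}], "_source": ["batch_id"]}


def batch_query(batch_id):
    """Search body matching every document of one batch (assumes keyword mapping)."""
    return {"query": {"term": {"batch_id": batch_id}}}


def batch_range_query(batch_id):
    """Search body returning the earliest and latest @timestamp within one batch."""
    return {
        "size": 0,
        "query": {"term": {"batch_id": batch_id}},
        "aggs": {
            "start": {"min": {"field": "@timestamp"}},
            "end": {"max": {"field": "@timestamp"}},
        },
    }
```

With this in place, one request finds the latest `batch_id`, and a second fetches that batch's documents or its date range, regardless of how many days the ingestion spanned.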
