
Use Kafka topics to store data for many years

I am looking for a way to collect metrics data from multiple devices. The data should be aggregated with multiple "group by"-like functions. The list of aggregation functions is not complete; new aggregations will be added later, and they will need to be applied to all of the data collected from the first day.

Is it fine to create a Kafka topic with a 100-year retention period and use it as a datastore for this purpose? New aggregations would then be able to read from the start of the topic, while existing aggregations continue from their own offsets.

In principle, yes, you can use Kafka for long-term storage, exactly for the reason you outline: reprocessing of the source data to derive additional aggregates/calculations.
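As a rough illustration of that pattern: a new aggregation job consumes under a fresh consumer group with `auto.offset.reset=earliest`, so it starts from the oldest retained record, while existing groups keep their committed offsets. Below is a minimal sketch with the plain Java client; the topic name `metrics` and the group id `new-aggregation-v2` are placeholders, not anything from the question:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class NewAggregationConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // A group id no consumer has used before: there are no committed
        // offsets for it yet, so auto.offset.reset takes effect.
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "new-aggregation-v2");
        // Start from the oldest retained record instead of the latest one.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("metrics"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Feed each historical record into the new aggregation here.
                }
            }
        }
    }
}
```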


Yes, if you want to keep the data, you can just increase the retention time to a large value.

I'd still recommend also having a size-based retention policy to ensure you don't run out of disk space.
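Both of those settings are topic-level configs: `retention.ms=-1` disables time-based deletion entirely, while `retention.bytes` caps the size of each partition's log, and whichever limit is hit first triggers deletion of the oldest segments. A hedged sketch using the Java AdminClient; the topic name `metrics` and the 100 GiB cap are placeholder values, not recommendations:

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class RetentionConfig {
    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(
                Map.of(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "metrics");
            Map<ConfigResource, Collection<AlterConfigOp>> updates = Map.of(topic, List.of(
                    // Never delete records based on age.
                    new AlterConfigOp(new ConfigEntry("retention.ms", "-1"),
                            AlterConfigOp.OpType.SET),
                    // Safety net: cap each partition's log at 100 GiB (placeholder).
                    new AlterConfigOp(new ConfigEntry("retention.bytes", "107374182400"),
                            AlterConfigOp.OpType.SET)));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```

Note that `retention.bytes` applies per partition, so the total disk budget for the topic is roughly that value multiplied by the partition count.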
