
ksqlDB for finding the average over the last hour and storing the results back to a Kafka topic?

We have a Redpanda (Kafka-compatible) source with sensor data. Can we do the following:

  1. Every hour, compute the average sensor reading over the previous hour for each sensor
  2. Store the results back to a topic

You want to create a materialized view over the stream of events that can be queried by other applications. Your source publishes the individual events to Kafka/Redpanda; another process observes the events and makes them available as queryable "tables" for other applications. Here are a few options:
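To make the idea of a materialized view concrete, here is a minimal in-memory sketch in plain Python. It is not how any of the products below implement it; the class name `HourlyAverageView`, the `(sensor_id, timestamp, value)` event shape, and the use of epoch seconds are all assumptions made for illustration.

```python
from collections import defaultdict

HOUR_SECONDS = 3600


class HourlyAverageView:
    """In-memory stand-in for a materialized view keyed by (sensor, hour)."""

    def __init__(self):
        # (sensor_id, window_start) -> [running sum, event count]
        self._state = defaultdict(lambda: [0.0, 0])

    def apply(self, sensor_id, timestamp, value):
        """Fold one event into the view; timestamp is epoch seconds (assumed)."""
        window_start = int(timestamp) // HOUR_SECONDS * HOUR_SECONDS
        acc = self._state[(sensor_id, window_start)]
        acc[0] += value
        acc[1] += 1

    def query(self, sensor_id, window_start):
        """Return the current average for one sensor and hour, or None."""
        acc = self._state.get((sensor_id, window_start))
        return acc[0] / acc[1] if acc else None


# Hypothetical events: (sensor_id, epoch_seconds, value)
view = HourlyAverageView()
for event in [("s1", 10, 20.0), ("s1", 1800, 22.0), ("s2", 30, 5.0), ("s1", 3700, 30.0)]:
    view.apply(*event)
```

The view is updated incrementally as each event arrives, so a reader calling `view.query("s1", 0)` never has to rescan the raw events.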

ksqlDB is the likely default choice, as it comes "native" in the Kafka/Confluent stack. Be careful about running it against your production Kafka cluster: it has a heavy impact on cluster performance. See the basic tutorial or the advanced tutorial.

Use an out-of-the-box solution for materialized views, such as Materialize. It is the easiest to set up and use, and it doesn't stress the Kafka broker. However, it is single-node only as of now (06/2022). See the tutorial.

Another popular option is to use a stream processor and store the hourly aggregates in an attached database (for example, Flink storing data to Redis). This is a do-it-yourself approach. Have a look at Hazelcast. It is a single process that runs both the stream processing services and a queryable store.
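For the do-it-yourself route, the core logic is a one-hour tumbling window per sensor that emits a record when the hour closes. The sketch below shows only that logic in plain Python, under stated assumptions: events are `(sensor_id, epoch_seconds, value)` tuples, they arrive in timestamp order with no late data, and the output record fields (`sensor_id`, `window_start`, `avg`) are hypothetical. Reading from and writing to the actual topics would be done with whatever Kafka client you use and is left out here.

```python
import json

HOUR_SECONDS = 3600


def hourly_averages(events):
    """Yield one JSON record per sensor each time an hourly window closes.

    `events` is an iterable of (sensor_id, epoch_seconds, value) tuples,
    assumed to arrive in timestamp order (no late-data handling).
    """
    current_window = None
    sums = {}  # sensor_id -> (sum, count) for the currently open window
    for sensor_id, timestamp, value in events:
        window_start = int(timestamp) // HOUR_SECONDS * HOUR_SECONDS
        if current_window is not None and window_start != current_window:
            # The hour rolled over: emit the finished window and reset state.
            yield from _flush(current_window, sums)
            sums = {}
        current_window = window_start
        total, count = sums.get(sensor_id, (0.0, 0))
        sums[sensor_id] = (total + value, count + 1)
    if current_window is not None:
        # End of input: flush the last (possibly incomplete) window.
        yield from _flush(current_window, sums)


def _flush(window_start, sums):
    """Serialize the per-sensor averages of one window, sorted by sensor id."""
    for sensor_id in sorted(sums):
        total, count = sums[sensor_id]
        yield json.dumps(
            {"sensor_id": sensor_id, "window_start": window_start, "avg": total / count}
        )


# Hypothetical input; each resulting string would be the value of one
# message produced to the output topic.
events = [("s1", 0, 10.0), ("s2", 60, 4.0), ("s1", 1800, 20.0), ("s1", 3600, 50.0)]
records = list(hourly_averages(events))
```

A real stream never ends, so in practice a window is closed by a timer or a watermark rather than by the next event; that is the part a stream processor such as Flink or Hazelcast handles for you.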


 