
Monitoring pub/sub services

For every service that reads from or writes to a topic in Kafka or Redis, there are some basic metrics that we want to have in Prometheus:

  1. How "fast" the writes are for every topic
  2. How "fast" the reads are for every topic
    • In Kafka, I may want to determine how "fast" each group-id reads.

To determine the "speed" of reading from a topic, one could imagine a mechanism where a producer publishes the same message every 10 seconds and the consumer reports to Prometheus when it has fully processed that message. If the graph shows that the message was processed every 12 seconds, it means that we have a lag of 2 seconds when reading any message.
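The arithmetic behind this probe idea can be sketched as follows. This is only an illustration, assuming the publish time and the processing-complete time of each probe are available as Unix timestamps in seconds; the function name probe_delays is hypothetical and not part of any library.

```python
from typing import List


def probe_delays(published_at: List[float], processed_at: List[float]) -> List[float]:
    """Return the delay in seconds between publishing and fully processing each probe.

    Both lists are assumed to hold timestamps for the same probes, in the same order.
    """
    if len(published_at) != len(processed_at):
        raise ValueError("each probe needs one publish time and one processing time")
    return [done - sent for sent, done in zip(published_at, processed_at)]


# Probes published every 10 seconds, each fully processed 2 seconds later.
published = [0.0, 10.0, 20.0]
processed = [2.0, 12.0, 22.0]
print(probe_delays(published, processed))  # [2.0, 2.0, 2.0]
```

In a real setup the consumer would report each delay to Prometheus instead of printing it, and the comparison is only meaningful if the producer and consumer clocks are synchronized.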

This looks like a lot of repeated manual work for every topic in the system.

Question

Does my proposal make any sense? Are there any best practices or tools for determining the "lag" or "speed" of reading from and writing to every topic in Redis/Kafka/... in Prometheus?

I had the exact same issue once.

Maintaining metrics for each topic manually is very tiring and does not scale at all.

I switched over to using the kafka_consumergroup_lag metric from kafka_exporter. This, along with the consumergroup and topic labels, was enough to tell us which topic was not being read or was lagging behind, and for which consumer group.

It also has other metrics, such as the rate of messages being read.
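For intuition about what a consumer-group lag figure represents, the sketch below computes one by hand: for each partition, the lag is the newest offset in the partition minus the offset the group has committed, and the per-partition values are summed per consumer group and topic. The input dictionaries and the function name are hypothetical stand-ins for data an exporter would collect from the brokers; this is not the exporter's actual code.

```python
from collections import defaultdict
from typing import Dict, Tuple

# (topic, partition) -> newest offset in that partition
LatestOffsets = Dict[Tuple[str, int], int]
# (consumergroup, topic, partition) -> offset committed by that group
CommittedOffsets = Dict[Tuple[str, str, int], int]


def lag_per_group_and_topic(
    latest: LatestOffsets, committed: CommittedOffsets
) -> Dict[Tuple[str, str], int]:
    """Sum per-partition lag into one value per (consumergroup, topic)."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for (group, topic, partition), offset in committed.items():
        newest = latest.get((topic, partition))
        if newest is None:
            continue  # no broker-side offset known for this partition
        # Clamp at zero in case the two snapshots were taken at different moments.
        totals[(group, topic)] += max(newest - offset, 0)
    return dict(totals)


latest = {("orders", 0): 120, ("orders", 1): 80}
committed = {
    ("billing", "orders", 0): 100,
    ("billing", "orders", 1): 80,
    ("audit", "orders", 0): 120,
}
print(lag_per_group_and_topic(latest, committed))
# {('billing', 'orders'): 20, ('audit', 'orders'): 0}
```

A lag that keeps growing for one (consumergroup, topic) pair is the signal that the group is not keeping up with that topic.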

As for expressing this lag in terms of time, one option is to attach a produce timestamp to each Kafka message, read it at the other end of the Kafka pipeline, and export the time difference from the application to Prometheus via Micrometer.
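A minimal sketch of the produce-time approach, written in plain Python rather than with Micrometer so that it stays self-contained. It assumes JSON message bodies; the field name produced_at_ms and both function names are hypothetical, and the result is only meaningful if the producer and consumer clocks are synchronized.

```python
import json
import time
from typing import Any, Dict, Optional


def now_ms() -> int:
    """Current wall-clock time in milliseconds since the Unix epoch."""
    return int(time.time() * 1000)


def encode_with_produce_time(
    payload: Dict[str, Any], produced_at_ms: Optional[int] = None
) -> bytes:
    """Producer side: copy the payload and stamp it with the produce time."""
    stamped = dict(payload)
    stamped["produced_at_ms"] = now_ms() if produced_at_ms is None else produced_at_ms
    return json.dumps(stamped).encode("utf-8")


def latency_seconds(raw: bytes, consumed_at_ms: Optional[int] = None) -> float:
    """Consumer side: seconds elapsed between producing and consuming the message."""
    message = json.loads(raw.decode("utf-8"))
    consumed = now_ms() if consumed_at_ms is None else consumed_at_ms
    return (consumed - message["produced_at_ms"]) / 1000.0


# Fixed timestamps make the example deterministic; omit them to use the real clock.
raw = encode_with_produce_time({"order_id": 42}, produced_at_ms=1_000)
print(latency_seconds(raw, consumed_at_ms=3_500))  # 2.5
```

In the application, the value returned on the consumer side would be recorded in a timer or histogram of the metrics library in use, so that Prometheus can scrape it.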

Or, better still, use tracing to follow each message through the pipeline with OpenTracing tools such as Jaeger.

Use this for Redis monitoring.

All of these exporters expose their data in the Prometheus format and can be integrated directly.
