
What is the best way to detect that a Kafka cluster is down?

The Kafka Consumer API conveniently hides transient connection errors and simply resumes reading from its current offset if a Kafka broker dies and comes back up.

But in some applications it's important to alert and stop processing data (from other sources) if the entire Kafka cluster is down (i.e. all brokers). I've browsed the various APIs, and that doesn't seem to be a feature.

The closest I've come is to submit an Admin call and, depending on a timeout, conclude that the Kafka cluster is down:

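Properties properties = ... // Load properties from somewhere.
int timeout = 5_000; // 5 second timeout
// try-with-resources closes the client whether or not the call succeeds.
try (AdminClient adminClient = AdminClient.create(properties)) {
    adminClient.listTopics(new ListTopicsOptions().timeoutMs(timeout)).listings().get();
    // Here we know the cluster is up, as the call returned within the timeout.
} catch (ExecutionException ex) {
    // Here we assume the cluster is down, as the call failed or timed out.
} catch (InterruptedException ex) {
    // Restore the interrupt flag; the check itself was inconclusive.
    Thread.currentThread().interrupt();
}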

Is this the best way to do it?

Another way is to query ZooKeeper, but the above approach will also work in situations where there's a network problem between the application and Kafka.

Your approach looks fine. A similar approach (using the notion of Spring's HealthIndicator) is what MartinX3 did here.

His solution:

@Component
public class KafkaHealthIndicator implements HealthIndicator {
    private final Logger log = LoggerFactory.getLogger(KafkaHealthIndicator.class);

    private KafkaTemplate<String, String> kafka;

    public KafkaHealthIndicator(KafkaTemplate<String, String> kafka) {
        this.kafka = kafka;
    }

    /**
     * Return an indication of health.
     *
     * @return the health for
     */
    @Override
    public Health health() {
        try {
            kafka.send("kafka-health-indicator", "❥").get(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            return Health.down(e).build();
        }
        return Health.up().build();
    }
}

You might also want to combine other metric checks, such as ActiveControllerCount = 0, before returning Health.up().build(), depending on what your use case requires in order to treat the entire cluster as down.
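The idea of combining several checks before reporting the cluster as up could be sketched roughly as follows, using only the JDK. The class CompositeClusterCheck and the check names are hypothetical and not part of Kafka or Spring; each supplier is assumed to wrap a real probe, such as the send shown above or a read of the ActiveControllerCount metric.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: the cluster counts as up only if every registered check passes. */
final class CompositeClusterCheck {
    // LinkedHashMap keeps the checks in registration order.
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    /** Registers a named check; returns this so calls can be chained. */
    CompositeClusterCheck add(String name, BooleanSupplier check) {
        checks.put(name, check);
        return this;
    }

    /** Returns the names of checks that returned false or threw, in registration order. */
    List<String> failedChecks() {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> entry : checks.entrySet()) {
            boolean ok;
            try {
                ok = entry.getValue().getAsBoolean();
            } catch (RuntimeException ex) {
                ok = false; // a probe that throws is treated as a failed check
            }
            if (!ok) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    /** True only when no check failed. */
    boolean isUp() {
        return failedChecks().isEmpty();
    }
}
```

With something like this, health() could report down whenever failedChecks() is non-empty and include the failed names in the details, which makes it easier to see which condition tripped.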

Although I would suggest using a proper monitoring tool, if you still want to do this programmatically, one option is to use AdminClient and try fetching the topic names.

For example:

Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("request.timeout.ms", 5000);

// try-with-resources closes the client even when the call fails.
try (AdminClient adminClient = AdminClient.create(properties)) {
    ListTopicsResult topics = adminClient.listTopics();
    Set<String> names = topics.names().get();
} catch (InterruptedException | ExecutionException e) {
    System.err.println("Kafka is unavailable");
}

Note, however, that the above won't throw an exception if only some of the brokers are down. (Obviously, a single broker being down doesn't mean that the Kafka cluster itself is down, as the data should still be accessible.)
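If you also want a rough idea of how many brokers are reachable, one possibility is a plain TCP probe per broker address. This is only a sketch using JDK sockets, not a Kafka API: the class BrokerReachability is hypothetical, the caller is assumed to know the broker addresses, and an open port only shows that something is listening there, not that the broker is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

/** Hypothetical helper: counts how many broker addresses accept a TCP connection. */
final class BrokerReachability {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, unresolved, or timed out
        }
    }

    /** Each entry is "host:port", the same format used in bootstrap.servers. */
    static int countReachable(List<String> brokers, int timeoutMs) {
        int reachable = 0;
        for (String broker : brokers) {
            int idx = broker.lastIndexOf(':');
            String host = broker.substring(0, idx);
            int port = Integer.parseInt(broker.substring(idx + 1));
            if (canConnect(host, port, timeoutMs)) {
                reachable++;
            }
        }
        return reachable;
    }
}
```

A result of zero for every known broker would be consistent with the whole cluster being down or unreachable from the application, while a non-zero but reduced count points at a partial outage.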
