Kafka Streams Shutdown Hook and Unexpected Exception Handling in the same Stream application

I was tasked with tearing down a Dev environment and setting it up again from scratch to verify our CI/CD processes. The only problem was that I messed up creating one topic, so the Kafka Streams application exited with an error.

I dug into it, found the issue, and corrected it, but while digging I ran into another odd little problem.

I implemented an uncaught exception handler like so:

streams.setUncaughtExceptionHandler((t, e) -> {
    logger.fatal("Caught unhandled Kafka Streams Exception:", e);
    // Do some exception handling.
    streams.close();

    // Maybe do some more exception handling.
    // Open a lock that is waiting after streams.start() call 
    // to let application exit normally
    shutdownLatch.countDown();
});

The problem is that if the application throws an exception because of a topic error, then when KafkaStreams::close is called the application seems to deadlock in WindowsSelectorImpl::poll after attempting a call to KafkaStreams::waitOnState.

I thought it might be an issue with calling KafkaStreams::close inside the exception handler, but I found this SO question and a comment from Matthias J. Sax saying that it should be OK to call KafkaStreams::close in the exception handler, with the caveat of not calling KafkaStreams::close from multiple threads.

The issue is that I want to implement a shutdown hook to stop the streams application gracefully on request, and also implement the uncaught exception handler to clean up and terminate gracefully when an exception occurs.

I came up with the following solution, which checks the KafkaStreams state before calling close. It does actually work, but it seems a bit iffy, since I can see other states besides running (perhaps Pending) in which we would still want to ensure that KafkaStreams::close is called.

Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    logger.fatal("Caught Shutdown request");
    // Do some shutdown cleanup.
    if (streams.state().isRunning())
    {
        // If this hook is called due to the Main exiting after handling
        // an exception we don't want to call close again. It doesn't
        // cause any errors but logs that the application was closed
        // a second time.
        streams.close(100L, TimeUnit.MILLISECONDS);
    }
    // Maybe do a little bit more clean up before system exits.
    System.exit(0);

}));

streams.setUncaughtExceptionHandler((t, e) -> {
    logger.fatal("Caught unhandled Kafka Streams Exception:", e);
    // Do some exception handling.
    if (streams.state().isRunning())
    {
        streams.close(100L, TimeUnit.MILLISECONDS);
    }
    // Maybe do some more exception handling.

    // Open the Gate to let application exit normally
    shutdownLatch.countDown();
    // Or Optionally call halt to immediately terminate and prevent call to Shutdown hook.
    Runtime.getRuntime().halt(0);
});

Any suggestions about why calling KafkaStreams::close in the exception handler would cause such trouble, or about a better way to implement the shutdown hook and the exception handler at the same time, would be greatly appreciated.

Calling close() from an exception handler and from a shutdown hook is slightly different. close() might deadlock if called from the shutdown hook (cf. https://issues.apache.org/jira/browse/KAFKA-4366), and thus you should call it with a timeout.
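KafkaStreams has its own timed close (the question uses close(100L, TimeUnit.MILLISECONDS)), so nothing extra is needed there. Purely to illustrate what bounding a blocking shutdown call means, here is a minimal sketch in plain Java. The BoundedClose class and its closeWithin method are hypothetical names invented for this example and are not part of Kafka.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not a Kafka API): runs a close action on a daemon
// thread and gives up waiting after the timeout.
final class BoundedClose {

    /** Returns true if closeAction finished normally within the timeout. */
    static boolean closeWithin(Runnable closeAction, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-close");
            t.setDaemon(true); // a stuck close must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the close attempt and move on
            return false;
        } catch (ExecutionException e) {
            return false; // the close action itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The caller only learns whether the close completed in time; an action that ignores interruption may keep running on the daemon thread in the background.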

Also, the issue is related to calling System.exit() from within an uncaught exception handler, as described in the Jira. In general, calling System.exit() is quite harsh and should be avoided, IMHO.

Your solution does not seem to be 100% robust either, because streams.state().isRunning() could result in a race condition: the state can change between the check and the call to close().
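To make the race concrete: two threads can both see isRunning() return true and both go on to call close(). One possible way to let only the first caller through is an atomic compare-and-set guard. The sketch below is an illustration, not something from the Kafka API. CloseOnce and closeIfFirst are hypothetical names, and the Runnable stands in for the real call (for example () -> streams.close(...)).

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard (not a Kafka API): lets exactly one caller run the
// close action, no matter how many threads ask.
final class CloseOnce {
    // Stand-in for the real close call, e.g. () -> streams.close(...)
    private final Runnable closeAction;
    private final AtomicBoolean closeRequested = new AtomicBoolean(false);

    CloseOnce(Runnable closeAction) {
        this.closeAction = closeAction;
    }

    /** Runs the close action only for the first caller; returns true if this call ran it. */
    boolean closeIfFirst() {
        // compareAndSet checks and flips the flag in one atomic step, so there
        // is no window between "check" and "act" for a second thread to slip in.
        if (closeRequested.compareAndSet(false, true)) {
            closeAction.run();
            return true;
        }
        return false;
    }
}
```

Both the shutdown hook and the exception handler could call closeIfFirst(). Note that this only prevents duplicate calls; it does not address the deadlock described in the Jira.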

An alternative to using a timeout might be to only set an AtomicBoolean within both the shutdown hook and the exception handler, and to use the "main()" thread to call close once the boolean flag is set to true:

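import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.kafka.streams.KafkaStreams;

private final static AtomicBoolean stopStreams = new AtomicBoolean(false);

public static void main(String[] args) throws InterruptedException {
  // do stuff

  KafkaStreams streams = ...
  // Neither callback calls close(); each one only raises the flag.
  streams.setUncaughtExceptionHandler((t, e) -> {
    stopStreams.set(true);
  });

  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    stopStreams.set(true);
  }));

  streams.start();

  // Only the main thread calls close(), once the flag has been set.
  while (!stopStreams.get()) {
    Thread.sleep(1000);
  }
  streams.close();
}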
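A variation on the same idea, sketched under some assumptions: a CountDownLatch (as the question already uses) replaces the one-second polling loop. In addition, the JVM halts as soon as all shutdown hooks return, so the hook here waits a bounded time for main to finish closing. ShutdownCoordinator and its method names are hypothetical, and the Runnable passed in stands in for streams.close().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical coordinator (not a Kafka API): callbacks only signal,
// and the main thread is the only one that performs the close.
final class ShutdownCoordinator {
    private final CountDownLatch stopRequested = new CountDownLatch(1);
    private final CountDownLatch closed = new CountDownLatch(1);

    /** For the uncaught exception handler: signal only, never close. */
    void requestStop() {
        stopRequested.countDown();
    }

    /** For the shutdown hook: signal, then wait (bounded) until main has closed. */
    boolean requestStopAndAwaitClose(long timeout, TimeUnit unit) {
        stopRequested.countDown();
        try {
            return closed.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** For main: block until a stop is requested, run closeAction, release the hook. */
    void awaitStopThenClose(Runnable closeAction) {
        try {
            stopRequested.await();
        } catch (InterruptedException e) {
            // Treat interruption as a stop request and still close below.
            Thread.currentThread().interrupt();
        }
        try {
            closeAction.run();
        } finally {
            closed.countDown();
        }
    }
}
```

Wired into the example above, the exception handler would call requestStop(), the shutdown hook would call requestStopAndAwaitClose(...) with whatever timeout suits the deployment, and main would end with awaitStopThenClose(() -> streams.close()).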
