
Flink job failed with "Checkpoint Coordinator is suspending."

I ran a Flink job, but it failed after 18 hours. The failure message was: Checkpoint Coordinator is suspending.

[Screenshot: checkpoints]

[Screenshot: job overview]

Here is the JobManager log:

2020-10-10 13:53:10,636 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: kafkaSource -> Timestamps/Watermarks -> Process (1/1) (c38419ece8208c1ef2948087f2b84dd0) switched from RUNNING to FAILED.
java.lang.Exception: Could not perform checkpoint 3265 for operator Source: kafkaSource -> Timestamps/Watermarks -> Process (1/1).
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:785)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$triggerCheckpointAsync$3(StreamTask.java:760)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.run(StreamTaskActionExecutor.java:87)
    at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:78)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:261)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:485)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:469)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1394)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:974)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$5(StreamTask.java:870)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:843)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:776)
    ... 11 more

I don't know what caused this problem.

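This can happen when your application attempts a checkpoint while the checkpoint coordinator (which runs in the JobManager) is shutting down for some reason, so the checkpoint cannot complete. The shutdown itself can have several causes: for example, you started a new deployment, you cancelled the job, or the job had to exit because of a runtime exception.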
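To narrow down which of these causes applies, it can help to find the first task that switched to FAILED in the JobManager log and read its exception chain. Below is a minimal Python sketch of that idea, assuming the log format shown in the question (a timestamped ExecutionGraph line followed by the stack trace). The helper name `first_failure` is hypothetical and not part of Flink.

```python
import re

# ExecutionGraph state-transition line, as seen in the question's log:
# "<timestamp> INFO ...ExecutionGraph - <task> (<attempt id>) switched from RUNNING to FAILED."
FAILED_RE = re.compile(
    r" - (?P<task>.+) \((?P<attempt>[0-9a-f]+)\) switched from \w+ to FAILED\."
)
# A line that starts with a timestamp begins a new log record.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def first_failure(log_lines):
    """Return (task name, exception chain) for the first task that FAILED.

    The chain lists the top-level exception first and the innermost
    "Caused by" last. Returns (None, []) if no task failed.
    """
    task = None
    chain = []
    for line in log_lines:
        if task is None:
            match = FAILED_RE.search(line)
            if match:
                task = match.group("task")
            continue
        if TIMESTAMP_RE.match(line):
            break  # the stack trace of the failed task has ended
        stripped = line.strip()
        if stripped.startswith("Caused by: "):
            chain.append(stripped[len("Caused by: "):])
        elif stripped and not stripped.startswith(("at ", "...")):
            chain.append(stripped)  # top-level exception line
    return task, chain
```

Applied to the log in the question, the last entry of the chain is the `java.lang.NullPointerException` thrown from `StreamTask$CheckpointingOperation.executeCheckpointing`, which suggests the job exited because of a runtime exception in the source task during checkpoint 3265, and the coordinator was suspended as a result.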

