How to get the names of failed Flink jobs

Our Flink cluster sometimes restarts, and all jobs are restarted along with it. Occasionally some jobs fail to restart, and the failed count on the panel increases. However, the panel does not tell us which jobs failed.

As the total job count grows, it becomes harder to find the stopped jobs. Does anyone know how I can get the names of the failed jobs?

You could write a simple script for that, which would give you the list of names of the jobs that have failed.
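As a rough sketch of what such a script might look like, the Python below asks the JobManager's REST API for its job overview (`/jobs/overview`) and prints the names of jobs in a failed state. The address `http://localhost:8081`, the choice of which states count as "failed", and the function names are all assumptions for illustration. It can only report jobs the JobManager still knows about after the restart.

```python
import json
import urllib.request

# Assumption: the JobManager REST endpoint is reachable at this address.
FLINK_REST_URL = "http://localhost:8081"

# Assumption: these job states are the ones we want to treat as "failed".
FAILED_STATES = {"FAILED", "FAILING"}


def fetch_jobs_overview(base_url=FLINK_REST_URL, timeout=10):
    """Fetch the job overview document from Flink's REST API."""
    with urllib.request.urlopen(f"{base_url}/jobs/overview", timeout=timeout) as resp:
        return json.load(resp)


def failed_job_names(overview, states=FAILED_STATES):
    """Return the names of all jobs whose state is in `states`."""
    return [
        job["name"]
        for job in overview.get("jobs", [])
        if job.get("state") in states
    ]


if __name__ == "__main__":
    # Print one failed job name per line.
    for name in failed_job_names(fetch_jobs_overview()):
        print(name)
```

Keeping the filtering in `failed_job_names` separate from the HTTP call makes it easy to try the logic on a saved overview document without a live cluster.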

I am using this command to get a list of failed jobs:

$ yarn application -list -appStates KILLED

Set up an alert for when your cluster restarts. After the restart, check for the jobs that haven't restarted; you could set up alerts for those as well.
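The post-restart check could be sketched roughly as follows: keep a list of the job names you expect to be running, and compare it against a job overview document of the shape returned by Flink's REST endpoint `/jobs/overview`. The function name, the expected-job list, and the sample data are hypothetical. How the result is turned into an alert depends on your monitoring setup and is left out.

```python
def jobs_not_running(expected_names, overview):
    """Return the expected job names that have no job in state RUNNING."""
    running = {
        job["name"]
        for job in overview.get("jobs", [])
        if job.get("state") == "RUNNING"
    }
    return sorted(set(expected_names) - running)


if __name__ == "__main__":
    # Hypothetical data: in practice the overview would come from the cluster.
    expected = ["orders", "clicks", "payments"]
    overview = {
        "jobs": [
            {"jid": "a", "name": "orders", "state": "RUNNING"},
            {"jid": "b", "name": "clicks", "state": "FAILED"},
        ]
    }
    for name in jobs_not_running(expected, overview):
        print(f"job not running after restart: {name}")
```

Comparing against an expected list also catches jobs that are missing from the overview entirely, which a filter on failed states alone would not report.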
