
Is there any good way to monitor an Apache Beam Dataflow job's pipeline state?

We have a Dataflow job that we want to monitor with a StatsDClient: the job should send metrics to our Telegraf instance so that we get a heartbeat telling us whether the job is running or has failed, and we can then set up alerts on it.

We tried initializing the StatsDClient in the main function and sending metrics based on the value of PipelineResult.getState(), but this approach is not working for us.
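A minimal sketch of the approach described above, assuming the Python SDK; the endpoint, metric name, and state handling are illustrative, and plain UDP is used instead of a StatsD client library:

```python
import socket

# Hypothetical Telegraf/StatsD endpoint; replace with your own host and port.
STATSD_ADDR = ("127.0.0.1", 8125)

def send_gauge(name: str, value: int) -> bytes:
    """Send a StatsD gauge datagram ('<name>:<value>|g') and return the payload."""
    payload = f"{name}:{value}|g".encode()
    socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(payload, STATSD_ADDR)
    return payload

def report_state(state: str) -> bytes:
    """Heartbeat gauge: 1 while the pipeline reports RUNNING, 0 otherwise."""
    return send_gauge("dataflow.job.running", 1 if state == "RUNNING" else 0)

# In main(), after launching the pipeline (requires apache_beam, not shown):
#   result = pipeline.run()
#   report_state(str(result.state))  # e.g. "RUNNING" or "FAILED"
```

The weakness of this approach, as noted, is that the main function exits once the job is submitted, so there is nothing left running to keep reporting state.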

Instead of reading the state from the Dataflow job itself, you can use Cloud Monitoring:

  • Metric: Dataflow job Failed (for example)
  • Alerting policy based on this metric

The alert can be sent to a Pub/Sub topic.

You can then write the Pub/Sub client of your choice to consume messages from this topic (via a subscription) and forward each element to your StatsD client.
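For example, a minimal Pub/Sub-to-StatsD forwarder might look like this; the project, subscription, metric name, and endpoint are placeholders, and the incident JSON shape follows Cloud Monitoring's Pub/Sub notification format:

```python
import json
import socket

def alert_to_statsd_payload(message_data: bytes) -> bytes:
    """Turn a Cloud Monitoring alert notification (JSON) into a StatsD gauge payload.

    Monitoring notifications carry an 'incident' object whose 'state' is
    'open' while the condition is firing; we map that to 1, else 0.
    """
    incident = json.loads(message_data)["incident"]
    value = 1 if incident.get("state") == "open" else 0
    return f"dataflow.job.failed:{value}|g".encode()

def main():
    # Requires google-cloud-pubsub (pip install google-cloud-pubsub) and credentials.
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    subscription = subscriber.subscription_path("my-project", "dataflow-alerts-sub")
    statsd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def callback(message):
        statsd.sendto(alert_to_statsd_payload(message.data), ("127.0.0.1", 8125))
        message.ack()

    # Block forever, streaming messages from the subscription.
    subscriber.subscribe(subscription, callback=callback).result()

# main()  # uncomment to run against a real project
```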

Alerting policy:

There is a built-in metric for the Dataflow failed-job status; you can create an alerting policy based on it:

(screenshot: selecting the failed-job metric)

Then configure a threshold:

(screenshot: threshold configuration)

If a Dataflow job fails, it will trigger an alert.
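As a sketch, the same policy can also be created from the command line instead of the console; the `dataflow.googleapis.com/job/is_failed` metric type and the project/channel IDs below are assumptions you would need to adapt:

```shell
cat > policy.json <<'EOF'
{
  "displayName": "Dataflow job failed",
  "combiner": "OR",
  "conditions": [{
    "displayName": "job/is_failed above 0",
    "conditionThreshold": {
      "filter": "metric.type=\"dataflow.googleapis.com/job/is_failed\" AND resource.type=\"dataflow_job\"",
      "comparison": "COMPARISON_GT",
      "thresholdValue": 0,
      "duration": "60s",
      "aggregations": [{"alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_MAX"}]
    }
  }],
  "notificationChannels": ["projects/my-project/notificationChannels/CHANNEL_ID"]
}
EOF
gcloud alpha monitoring policies create --policy-from-file=policy.json
```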

Notification channel:

For the alerting policy's notification channel, you can choose a Pub/Sub topic.

(screenshot: Pub/Sub notification channel)

For the overall Dataflow job status (not only failed jobs), I saw that the job/status metric is in beta, but I have not used it yet.
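If you want to experiment with it, here is a hedged sketch of querying that metric through the Monitoring API; the metric type, label name, project, and job name are assumptions:

```python
import time

def status_filter(job_name: str) -> str:
    """Monitoring API filter for the (beta) job/status metric; names are assumptions."""
    return (
        'metric.type="dataflow.googleapis.com/job/status" '
        f'AND metric.labels.job_name="{job_name}"'
    )

def main():
    # Requires google-cloud-monitoring (pip install google-cloud-monitoring) and credentials.
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {"start_time": {"seconds": now - 600}, "end_time": {"seconds": now}}
    )
    series = client.list_time_series(
        request={
            "name": "projects/my-project",  # placeholder project
            "filter": status_filter("my-dataflow-job"),  # placeholder job name
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    for ts in series:
        print(ts.points)

# main()  # uncomment to run against a real project
```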
