Flink Metrics name collision

My Flink (1.6) job listens to a stream and performs some aggregation. I want to collect metrics after the aggregation but am having some difficulties.

My metrics look like this:

id_1, 0.1
id_2, 0.3
...

The ids will be variable and the values will increase and decrease over time so it looked like a Gauge was most appropriate.

I created this map function to capture these metrics in a gauge:

class MetricsMapper extends RichMapFunction[MyObject, Double] {
  override def map(obj: MyObject): Double = {
    val metricVal = obj.metricVal
    getRuntimeContext.getMetricGroup.gauge[Double, ScalaGauge[Double]](obj.id, ScalaGauge[Double](() => metricVal))
    metricVal
  }
}

As this shows, I'm using the id property of my object to register the gauge.

The problem I am having is that I receive this warning when I run the job:

Name collision: Group already contains a Metric with the name "x" Metric will not be reported

I interpret this to mean that the gauge was already registered earlier in the stream, so the new registration (and with it the new value) is ignored. Is there a way to overcome this?

Thanks

Are you sure you want to use metrics here? Metrics are usually a means of looking at how the job is performing. Typical values to expose as metrics are:

  • records per second,
  • late events,
  • number of corrupted events, etc.

In your case I would rather go with a side pipeline that produces those aggregates.

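You should follow the pattern shown in the documentation:

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.scala.metrics.ScalaGauge
import org.apache.flink.configuration.Configuration

class MyMapper extends RichMapFunction[MyObject, Double] {
  // Latest value seen by map(); the gauge reads it whenever metrics are reported.
  @transient private var valueToExpose = 0.0

  override def open(parameters: Configuration): Unit = {
    // Register the gauge once per operator instance.
    getRuntimeContext()
      .getMetricGroup()
      .gauge[Double, ScalaGauge[Double]]("MyGauge", ScalaGauge[Double](() => valueToExpose))
  }

  override def map(obj: MyObject): Double = {
    valueToExpose = obj.metricVal
    valueToExpose
  }
}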

In other words, register the gauge once in the open() method, and update the value each time map() is called.

In your case you want a separate gauge for each unique object id. If you really want to do this with metrics, you will have to keep something like a hash map of gauges, creating a new one the first time an id is seen and updating the value behind the relevant gauge in map(). Or better, key your stream by the id.
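The hash map idea could be sketched roughly as below. This is only an illustration and leaves Flink out so that it stands alone: PerIdGauges and its register hook are hypothetical names, not part of Flink. Inside a RichMapFunction the hook would wrap the same gauge(...) call used in open() above, and the holder would be created in open() rather than in the constructor.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical helper, not a Flink class: keeps the latest value per id and
// registers a gauge only the first time an id is seen, which avoids the
// "Name collision" warning caused by registering the same name twice.
//
// `register` is an assumed hook. In a RichMapFunction it could be:
//   (name, read) => getRuntimeContext.getMetricGroup
//     .gauge[Double, ScalaGauge[Double]](name, ScalaGauge[Double](read))
final class PerIdGauges(register: (String, () => Double) => Unit) {

  // Concurrent map, because the metrics reporter may read values from
  // another thread while map() is writing them.
  private val latest = TrieMap.empty[String, Double]

  def update(id: String, value: Double): Unit = {
    // put returns the previous value, so None means the id is new.
    val firstSeen = latest.put(id, value).isEmpty
    if (firstSeen) register(id, () => latest.getOrElse(id, Double.NaN))
  }
}
```

With this shape, map() would only call update(obj.id, obj.metricVal). One thing to watch is that the number of registered gauges grows with the number of distinct ids, so this only seems reasonable when the id set is small and bounded.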

Another factor to keep in mind when deciding whether metrics are appropriate is that metrics are not checkpointed.
