How to create a source to export metrics from Spark to another sink (Prometheus)?

I am trying to create a source for metrics from my Spark application written in Scala, so that I can export data to another system, preferably Prometheus. According to this article from Databricks, I need to create a class that extends the Source trait. However, Source is declared as private[spark] trait Source, so my class cannot see it. When I compile the class below I get the error Symbol Source is inaccessible from this place.

package org.sense.spark.util

import org.apache.spark.metrics.source.Source
import com.codahale.metrics.{Counter, Histogram, MetricRegistry}

class MetricSource extends Source {
  override val sourceName: String = "MySource"

  override val metricRegistry: MetricRegistry = new MetricRegistry

  val FOO: Histogram = metricRegistry.histogram(MetricRegistry.name("fooHistory"))
  val FOO_COUNTER: Counter = metricRegistry.counter(MetricRegistry.name("fooCounter"))
}

How can I create my source to export data to Prometheus? I would like to export monitored values from a UDF inside the combineByKey transformation. The values would be the aggregation latency and the IN/OUT throughput of this transformation.

This is my build.sbt file, in case you need to check the libraries that I am using.

name := "explore-spark"

version := "0.2"

scalaVersion := "2.12.3"

val sparkVersion = "3.0.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  "com.twitter" %% "algebird-core" % "0.13.7",
  "joda-time" % "joda-time" % "2.5",
  "org.fusesource.mqtt-client" % "mqtt-client" % "1.16"
)

mainClass in(Compile, packageBin) := Some("org.sense.spark.app.App")
mainClass in assembly := Some("org.sense.spark.app.App")

assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyJarName in assembly := s"${name.value}_${scalaBinaryVersion.value}-fat_${version.value}.jar"

You will need to put the class that extends Source in the same package as the trait itself (org.apache.spark.metrics.source), because Source is private[spark]:

package org.apache.spark.metrics.source

import com.codahale.metrics.{Counter, Histogram, MetricRegistry}

class MetricsSource extends Source {
  override val sourceName: String = "MySource"

  override val metricRegistry: MetricRegistry = new MetricRegistry

  val FOO: Histogram = metricRegistry.histogram(MetricRegistry.name("fooHistory"))
  val FOO_COUNTER: Counter = metricRegistry.counter(MetricRegistry.name("fooCounter"))
}
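Compiling the class is only half the job: the source also has to be registered with Spark's metrics system before any configured sink can see it. A minimal sketch, assuming a SparkContext is already running; note that MetricsSystem is also private[spark], so this registration code must likewise live under an org.apache.spark package (the MetricsSourceExample object name is mine, not from the original post):

```scala
package org.apache.spark.metrics.source

import org.apache.spark.SparkEnv

object MetricsSourceExample {
  def run(): Unit = {
    val source = new MetricsSource

    // Register the source with the driver's metrics system so that the
    // sinks configured in metrics.properties (console, JMX, Prometheus, ...)
    // start reporting its metrics.
    SparkEnv.get.metricsSystem.registerSource(source)

    // Update the metrics from application code, e.g. inside the merge
    // function passed to combineByKey:
    val start = System.nanoTime()
    // ... do the aggregation work here ...
    source.FOO.update(System.nanoTime() - start) // latency sample in ns
    source.FOO_COUNTER.inc()                     // one record processed
  }
}
```

Keep in mind that SparkEnv.get resolves per JVM: calling this on the driver registers the source only there, while metrics updated inside combineByKey run on the executors, so the registration has to happen in executor-side code for those updates to be reported.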

If you need a Prometheus sink instead of the console sink, you can use a third-party library written for exactly that purpose. It works via the Prometheus Pushgateway: https://github.com/banzaicloud/spark-metrics/blob/master/PrometheusSink.md
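With that library on the classpath, the sink is enabled through Spark's metrics.properties file (or the equivalent spark.metrics.conf.* settings). A sketch based on that project's documentation; the Pushgateway address is a placeholder you must adapt:

```
# Enable the Prometheus sink for all instances (driver, executors, ...)
*.sink.prometheus.class=com.banzaicloud.spark.metrics.sink.PrometheusSink
# Where the metrics are pushed; Prometheus then scrapes the Pushgateway
*.sink.prometheus.pushgateway-address-protocol=http
*.sink.prometheus.pushgateway-address=localhost:9091
```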
