
Can't expose Flink metrics to Prometheus

I'm trying to expose Flink's built-in metrics to Prometheus, but Prometheus doesn't recognize the targets for either the JMXReporter or the PrometheusReporter.

The scraping defined in prometheus.yml looks like this:

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']

  - job_name: 'kafka-server'
    static_configs:
      - targets: ['localhost:7071']

  - job_name: 'flink-jmx'
    static_configs:
      - targets: ['localhost:8789']

  - job_name: 'flink-prom'
    static_configs:
      - targets: ['localhost:9249']

And my flink-conf.yml has the following lines:

#metrics.reporters: jmx, prom
metrics.reporters: jmx, prometheus

#metrics.reporter.jmx.factory.class: org.apache.flink.metrics.jmx.JMXReporterFactory
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789

metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249

However, both Flink targets are down when running a WordCount:

  • in IntelliJ
  • as a jar: java -jar target/flink-word-count.jar --input src/main/resources/loremipsum.txt
  • as a Flink job: flink run target/flink-word-count.jar --input src/main/resources/loremipsum.txt

According to the Flink docs, I don't need any additional dependencies for JMX, and for the Prometheus reporter I only need a copy of the provided flink-metrics-prometheus-1.10.0.jar in flink/lib/.

What am I doing wrong? What is missing?
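One detail worth double-checking in the flink-conf above, assuming Flink's usual reporter-name resolution: the `<name>` segment in each metrics.reporter.`<name>`.* key must match a name listed under metrics.reporters. The config registers jmx, prometheus but then configures metrics.reporter.prom.*, so the Prometheus reporter settings would never be picked up. A sketch of a consistent version:

```yaml
# flink-conf.yaml (sketch): the name "prom" used in the keys below must
# appear verbatim in metrics.reporters, or those keys are ignored.
metrics.reporters: jmx, prom

metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789

metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249
```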

That particular job is going to run to completion pretty quickly, I believe. Once you get the setup working, there may be no interesting metrics anyway, because the job doesn't run long enough for anything to show up.

When you run with a mini-cluster (as java -jar ...), the flink-conf.yaml file isn't loaded (unless you've done something rather special in your job to get it loaded). Note also that this file normally has a .yaml extension; I'm not sure whether it works if .yml is used instead.
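If the mini-cluster caveat doesn't apply (e.g. when the job is submitted with flink run), the file name is worth ruling out too: the standard name Flink looks for is conf/flink-conf.yaml. A trivial rename rules it out; this sketch uses a throwaway directory where a real setup would use $FLINK_HOME/conf:

```shell
# Sketch: a file named flink-conf.yml may be silently ignored, since Flink
# looks specifically for flink-conf.yaml. A temp dir stands in for
# $FLINK_HOME/conf here.
conf_dir=$(mktemp -d)
touch "$conf_dir/flink-conf.yml"                     # questionable extension
mv "$conf_dir/flink-conf.yml" "$conf_dir/flink-conf.yaml"
ls "$conf_dir"
```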

You can check the job manager and task manager logs to make sure that the reporters are being loaded.
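A quick way to do that check is to grep the logs for reporter-related lines. The FLINK_HOME default below is an assumption; point it at your actual installation:

```shell
# Sketch: list log files that mention a metrics reporter being set up.
# On a machine with no Flink logs this just reports that nothing was found.
FLINK_HOME=${FLINK_HOME:-/opt/flink}
grep -li "reporter" "$FLINK_HOME"/log/*.log 2>/dev/null \
  || echo "no reporter lines found"
```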

FWIW, the last time I did this I used the following setup, so that I could scrape from multiple processes:

# flink-conf.yaml

metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9250-9260

# prometheus.yml

scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['localhost:9250', 'localhost:9251']
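Once a job is actually running, each process's endpoint can be spot-checked directly. The port here is the first one from the 9250-9260 range above; a refused connection simply means no Flink process is serving metrics on that port yet:

```shell
# Probe one reporter endpoint from the configured port range.
if curl -sf --max-time 2 http://localhost:9250/metrics >/dev/null; then
  echo "metrics endpoint is up"
else
  echo "metrics endpoint not reachable"
fi
```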
