Sending metrics from Telegraf to Prometheus

I'm running Prometheus and Telegraf on the same host.

I'm using a few input plugins:

  • inputs.cpu
  • inputs.ntpq

I've configured the prometheus_client output plugin to send data to Prometheus.

Here's my config:

    [[outputs.prometheus_client]]
      ## Address to listen on.
      listen = ":9126"

      ## Use HTTP Basic Authentication.
      # basic_username = "Foo"
      # basic_password = "Bar"

      ## If set, the IP Ranges which are allowed to access metrics.
      ##   ex: ip_range = ["192.168.0.0/24", "192.168.1.0/30"]
      # ip_range = []

      ## Path to publish the metrics on.
      path = "/metrics"

      ## Expiration interval for each metric. 0 == no expiration
      #expiration_interval = "0s"

      ## Collectors to enable, valid entries are "gocollector" and "process".
      ## If unset, both are enabled.
      # collectors_exclude = ["gocollector", "process"]

      ## Send string metrics as Prometheus labels.
      ## Unless set to false all string metrics will be sent as labels.
      # string_as_label = true

      ## If set, enable TLS with the given certificate.
      # tls_cert = "/etc/ssl/telegraf.crt"
      # tls_key = "/etc/ssl/telegraf.key"

      ## Export metric collection time.
      #export_timestamp = true

Here's my Prometheus config:

    # my global config
    global:
      scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
          - targets: ['localhost:9090']

    #  - job_name: 'node_exporter'
    #    scrape_interval: 5s
    #    static_configs:
    #      - targets: ['localhost:9100']

      - job_name: 'telegraf'
        scrape_interval: 5s
        static_configs:
          - targets: ['localhost:9126']

If I go to http://localhost:9090/metrics, I don't see any metrics coming from Telegraf.

I've captured some logs from Telegraf as well:

    ➜  /opt telegraf --config /etc/telegraf/telegraf.conf --input-filter filestat --test
    ➜  /opt tail -F /var/log/telegraf/telegraf.log
    2019-02-11T17:34:20Z D! [outputs.prometheus_client] wrote batch of 28 metrics in 1.234869ms
    2019-02-11T17:34:20Z D! [outputs.prometheus_client] buffer fullness: 0 / 10000 metrics.
    2019-02-11T17:34:30Z D! [outputs.file] wrote batch of 28 metrics in 384.672µs
    2019-02-11T17:34:30Z D! [outputs.file] buffer fullness: 0 / 10000 metrics.
    2019-02-11T17:34:30Z D! [outputs.prometheus_client] wrote batch of 30 metrics in 1.250605ms
    2019-02-11T17:34:30Z D! [outputs.prometheus_client] buffer fullness: 9 / 10000 metrics.

I don't see an issue in the logs.

The /metrics endpoint of your Prometheus server exports metrics about the server itself, not the metrics it has scraped from targets such as the Telegraf exporter.
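To see what Telegraf itself exposes, it can help to read its own endpoint rather than the Prometheus server's. The following is a minimal sketch, assuming the endpoint is http://localhost:9126/metrics (from listen = ":9126" and path = "/metrics" in the question's config); the helper names metric_names and fetch are made up for illustration.

```python
import urllib.request


def metric_names(exposition_text):
    """Return the sorted metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="v"} 1.5   or   name 1.5
        head = line.split("{", 1)[0]
        names.add(head.split()[0])
    return sorted(names)


def fetch(url, timeout=5):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (needs Telegraf running; the address is an assumption from the config):
#   for name in metric_names(fetch("http://localhost:9126/metrics")):
#       print(name)
```

If the CPU metrics show up here but not in Prometheus, the problem is on the scraping side rather than in Telegraf.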

Go to http://localhost:9090/targets; you should see a list of the targets that your Prometheus server is scraping. If everything is configured correctly, the Telegraf exporter should be one of them.
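The same target list is also available from the Prometheus HTTP API at /api/v1/targets. Here is a rough sketch of checking it from a script, assuming Prometheus listens on localhost:9090 as in the question; target_health and fetch_json are illustrative names, not part of any library.

```python
import json
import urllib.request


def target_health(payload):
    """Summarise active targets as (job, scrape URL, health, last error) tuples."""
    rows = []
    for target in payload.get("data", {}).get("activeTargets", []):
        rows.append((
            target.get("labels", {}).get("job", ""),
            target.get("scrapeUrl", ""),
            target.get("health", "unknown"),
            target.get("lastError", ""),
        ))
    return rows


def fetch_json(url, timeout=5):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (needs Prometheus running; the address is an assumption):
#   payload = fetch_json("http://localhost:9090/api/v1/targets")
#   for job, url, health, err in target_health(payload):
#       print(job, url, health, err)
```

A telegraf row with health "up" means the scrape works; any other state usually comes with a non-empty last error that explains why.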

To query Prometheus for metrics generated by the Telegraf exporter, open http://localhost:9090/graph in your browser and enter, e.g., cpu_time_user in the query field. If the CPU plugin is enabled, that metric and others should be available.
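The same query can be run without a browser through the Prometheus HTTP API's instant-query endpoint, /api/v1/query. The sketch below assumes Prometheus on localhost:9090; the metric name cpu_time_user is taken from the answer, and the actual name depends on how the Telegraf CPU plugin is configured. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode(
        {"query": promql}
    )


def instant_values(payload):
    """Extract (labels, value) pairs from an instant-query vector response."""
    results = payload.get("data", {}).get("result", [])
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    return [(r["metric"], float(r["value"][1])) for r in results]


def fetch_json(url, timeout=5):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (needs Prometheus running; address and metric name are assumptions):
#   payload = fetch_json(query_url("http://localhost:9090", "cpu_time_user"))
#   for labels, value in instant_values(payload):
#       print(labels, value)
```

An empty result list means Prometheus has no series with that name, which points back to the target status or to the metric name itself.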

Use the following Prometheus config file to scrape the metrics exported by Telegraf's prometheus_client output:

    scrape_configs:
    - job_name: telegraf
      static_configs:
      - targets:
        - "localhost:9126"

The path to this file must be passed to the --config.file command-line flag when starting Prometheus.

See more details about Prometheus config in these docs.

P.S. There is an alternative solution: push the metrics collected by Telegraf directly to a Prometheus-like system such as VictoriaMetrics instead of InfluxDB (see these docs). These metrics can later be queried with MetricsQL, a PromQL-compatible query language.
