
Prometheus up metric shows 0 even though the endpoint is reachable

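I have a simple pod with an nginx container that returns the text healthy on the path /. I have Prometheus scraping port 80 on the path /. When I run up == 0 in the Prometheus dashboard, it lists this pod, which means Prometheus considers the pod unhealthy. But when I ssh into the container, it is running fine, and the nginx log shows Prometheus requesting / and getting a 200 response. Any idea why?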

deployment.yml

apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    metadata:
      labels:
        ...
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/"
        prometheus.io/port: "80"
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: nginx-conf
              mountPath: /etc/nginx
              readOnly: true
          ports:
            - containerPort: 80
      volumes:
        - name: nginx-conf
          configMap:
            name: nginx-conf
            items:
              - key: nginx.conf
                path: nginx.conf


ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-conf
data:
  nginx.conf: |
    http {
      server {
        listen 80;

        location / {
          return 200 'healthy\n';
        }
      }
    }

nginx access log

192.168.88.81 - - [xxx +0000] "GET / HTTP/1.1" 200 8 "-" "Prometheus/2.26.0"
192.168.88.81 - - [xxx +0000] "GET / HTTP/1.1" 200 8 "-" "Prometheus/2.26.0"
192.168.88.81 - - [xxx +0000] "GET / HTTP/1.1" 200 8 "-" "Prometheus/2.26.0"

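When you configure these annotations on a pod, Prometheus expects the given path to return metrics in a format it can parse. But 'healthy\n' is not valid Prometheus metrics output, so the scrape fails and up is set to 0 even though nginx answers with a 200.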

      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/"
        prometheus.io/port: "80"
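
For context, these annotations are not built into Prometheus; they only take effect because a scrape job maps them onto the target through relabeling. The question does not show its Prometheus configuration, so the following is only a sketch of the commonly used convention, and the job name is an assumption:

```yaml
# Sketch of a pod scrape job that honours the prometheus.io/* annotations.
# Assumption: the cluster's Prometheus uses something close to this.
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path as the metrics path when it is set.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the target address to use prometheus.io/port.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

With a job like this, whatever the annotated path returns is parsed as metrics, which is why a plain-text health response makes the scrape fail.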

Recommended fix:

  • Use nginx-prometheus-exporter as a sidecar.
  • Add the sidecar's details to the annotations so that Prometheus can scrape metrics from it.

apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    metadata:
      labels:
        ...
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "9113"
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: nginx-conf
              mountPath: /etc/nginx
              readOnly: true
          ports:
            - containerPort: 80
        - name: nginx-exporter
          args:
          - "-nginx.scrape-uri=http://localhost:80/stub_status" # nginx address
          image: nginx/nginx-prometheus-exporter:0.9.0
          ports:
            - containerPort: 9113
      volumes:
        - name: nginx-conf
          configMap:
            name: nginx-conf
            items:
              - key: nginx.conf
                path: nginx.conf
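
Note that the exporter's scrape URI points at /stub_status, which the nginx.conf from the question does not serve. A minimal sketch of a ConfigMap that adds it, assuming the stock nginx image (which is built with the stub_status module); the events block and the allow/deny rules are additions that were not in the original configuration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-conf
data:
  nginx.conf: |
    # nginx requires an events section, even an empty one.
    events {}
    http {
      server {
        listen 80;

        location / {
          return 200 'healthy\n';
        }

        # Status page read by the exporter sidecar over localhost.
        location /stub_status {
          stub_status;
          allow 127.0.0.1;  # the sidecar shares the pod's network namespace
          deny all;
        }
      }
    }
```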

Now try querying nginx_up from Prometheus. nginx-prometheus-exporter also ships with a Grafana dashboard that you can try.

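When Prometheus scrapes an endpoint, it expects metrics. Typical metrics look like this:

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.3234e-05
go_gc_duration_seconds{quantile="0.25"} 1.7335e-05

"healthy" does not match this format, so Prometheus fails to scrape the target. An alternative is the blackbox exporter, which is designed to check endpoints from the user's point of view (this is black-box monitoring). The exporter can perform an HTTP request and produce metrics about the result. For example, it can check whether the response code is 200, or whether the response body contains certain text. Below are sample metrics returned by this exporter (note probe_success, which plays the same role for the probed endpoint as up does for a scrape target):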

# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.026007318
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.550007522
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_content_length Length of http content response
# TYPE probe_http_content_length gauge
probe_http_content_length -1
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.098082009
probe_http_duration_seconds{phase="processing"} 0.154402544
probe_http_duration_seconds{phase="resolve"} 0.038066771
probe_http_duration_seconds{phase="tls"} 0.209702302
probe_http_duration_seconds{phase="transfer"} 0.047839785
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 1
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 1
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_uncompressed_body_length Length of uncompressed response body
# TYPE probe_http_uncompressed_body_length gauge
probe_http_uncompressed_body_length 87617
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 2
# HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
# TYPE probe_ip_addr_hash gauge
probe_ip_addr_hash 8.57979034e+08
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_ssl_earliest_cert_expiry Returns earliest SSL cert expiry in unixtime
# TYPE probe_ssl_earliest_cert_expiry gauge
probe_ssl_earliest_cert_expiry 1.639030838e+09
# HELP probe_ssl_last_chain_expiry_timestamp_seconds Returns last SSL chain expiry in timestamp seconds
# TYPE probe_ssl_last_chain_expiry_timestamp_seconds gauge
probe_ssl_last_chain_expiry_timestamp_seconds 1.639030838e+09
# HELP probe_ssl_last_chain_info Contains SSL leaf certificate information
# TYPE probe_ssl_last_chain_info gauge
probe_ssl_last_chain_info{fingerprint_sha256="ef4eaeb464efb33f5332b365a350b2b06588ea71837af27f83d45b726d19af2a"} 1
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
# HELP probe_tls_version_info Contains the TLS version used
# TYPE probe_tls_version_info gauge
probe_tls_version_info{version="TLS 1.2"} 1
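
To apply this to the pod from the question, a setup along the following lines could work. It is only a sketch: the module name http_healthy, the exporter address blackbox-exporter:9115, and the target URL are hypothetical and need to be adapted to the actual cluster.

```yaml
# blackbox.yml (blackbox exporter configuration)
modules:
  http_healthy:                # hypothetical module name
    prober: http
    timeout: 5s
    http:
      valid_status_codes: [200]
      # The probe fails unless the body contains "healthy".
      fail_if_body_not_matches_regexp:
        - "healthy"
---
# prometheus.yml fragment: Prometheus scrapes the exporter,
# which in turn probes the real endpoint.
scrape_configs:
  - job_name: blackbox-nginx
    metrics_path: /probe
    params:
      module: [http_healthy]
    static_configs:
      - targets:
          - http://nginx.default.svc:80/   # hypothetical URL of the nginx endpoint
    relabel_configs:
      # Pass the original target to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual scrape to the blackbox exporter.
      - target_label: __address__
        replacement: blackbox-exporter:9115   # hypothetical exporter address
```

With this in place, probe_success == 0 would flag the endpoint as unhealthy, while the prometheus.io/scrape annotation on the nginx pod should be removed so that Prometheus no longer tries to parse the health page as metrics.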

