
Prometheus Operator is not scraping all pods

I have deployed prometheus-operator on a DigitalOcean cluster using kube-prometheus-stack (https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack), and I added some additional scrape configs for the Kubernetes pod role:

    - job_name: kubernetes-pods
      scrape_timeout: 5m
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_scrape
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        - __address__
        - __meta_kubernetes_pod_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: kubernetes_pod_name
      - action: drop
        regex: Pending|Succeeded|Failed
        source_labels:
        - __meta_kubernetes_pod_phase
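
For context, with kube-prometheus-stack a job like this is normally supplied through the chart's Helm values rather than by editing prometheus.yml directly. A minimal sketch of where it would sit, assuming the chart's `prometheus.prometheusSpec.additionalScrapeConfigs` value is the mechanism in use (the post does not say which one was used):

```yaml
# values.yaml for kube-prometheus-stack (placement is an assumption)
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # Keep only pods annotated with prometheus.io/scrape: "true"
          - action: keep
            regex: true
            source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_scrape
          # ...remaining relabel rules as in the job definition above
```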

After that, I deployed a Postgres database in the postgres namespace, with the Prometheus annotations set in the Postgres deployment YAML file. This is my file:

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "postgres"
    prometheus.io/path: "/metrics"
    prometheus.io/probe: "true"
  namespace: postgres
  name: postgres
  labels:
    component: postgres
spec:

  ports:
    - port: 5432
      name: postgres
  selector:
    component: postgres
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: postgres
  labels:
    component: postgres
spec:
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "postgres"
        prometheus.io/path: "/metrics"
      labels:
        component: postgres
    spec:
      containers:
      - image: postgres:11.1-alpine
        name: postgres
        securityContext:
          runAsUser: 70 # postgres user on Alpine
          allowPrivilegeEscalation: false
        resources:
          limits:
            memory: 2Gi
            cpu: 2
          requests:
            memory: 2Gi
            cpu: 2
        env:
          - name: PGDATA
            value: "/var/lib/postgresql/data/pgdata"
          - name: POSTGRES_DB
            valueFrom:
              secretKeyRef:
                name: postgres-secret
                key: PG_DATABASE
          - name: POSTGRES_USER
            valueFrom:
              secretKeyRef:
                name: postgres-secret
                key: PG_USERNAME
          - name: POSTGRES_PASSWORD
            valueFrom:
              secretKeyRef:
                name: postgres-secret
                key: PG_PASSWORD
        ports:
        - containerPort: 5432
          name: postgres
        volumeMounts:
          - name: postgres-storage
            mountPath: /var/lib/postgresql/data
        readinessProbe:
          tcpSocket:
            port: 5432
          initialDelaySeconds: 15
          periodSeconds: 5
        livenessProbe:
          exec:
            command:
            - pg_isready
            - -h
            - localhost
            - -U
            - test_user
            - -d
            - test_db
          initialDelaySeconds: 10
          periodSeconds: 5
      volumes:
        - name: postgres-storage
          emptyDir: {}

But in the Prometheus targets, the postgres pod shows an EOF error. Can someone help me solve this?

You need to use an exporter to expose the metrics in the Prometheus format.

You can't just send an HTTP request to the Postgres port and hope it will give you some metrics.
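
To make that concrete: with `role: pod`, Prometheus discovers one target per declared container port, so this pod is found as `<pod-ip>:5432`. The address-rewrite rule in the scrape config only fires when the `prometheus.io/port` annotation is numeric (its regex ends in `(\d+)`), and `postgres` is a port name, so the address stays on 5432. Prometheus then speaks HTTP to the Postgres wire-protocol port, which is most likely what surfaces as EOF. The annotations on the Service are not read by this job at all, only the pod annotations are. A sketch of pod annotations this scrape config can act on, assuming something in the pod serves metrics over HTTP on port 9187 (an assumed port, not taken from the question):

```yaml
# Pod template metadata (spec.template.metadata of the Deployment)
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9187"      # must be a number for the (\d+) regex to match
    prometheus.io/path: "/metrics"
```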

Use the postgres exporter. It talks to Postgres and exposes the metrics on an HTTP endpoint and, more importantly, in a format that Prometheus understands.
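
As a rough sketch of what that could look like here, the exporter can run as a sidecar container in the same pod. This assumes the community `postgres_exporter` image, its default listen port 9187, and its `DATA_SOURCE_URI` / `DATA_SOURCE_USER` / `DATA_SOURCE_PASS` environment variables. The container name, port name, and the choice of the `postgres` database are illustrative, not from the original post:

```yaml
# Extra entry under spec.template.spec.containers of the postgres Deployment
- name: postgres-exporter            # hypothetical container name
  image: quay.io/prometheuscommunity/postgres-exporter
  env:
    # Host, port and database without credentials; the sidecar shares
    # the pod's network namespace, so Postgres is reachable on localhost
    - name: DATA_SOURCE_URI
      value: "localhost:5432/postgres?sslmode=disable"
    - name: DATA_SOURCE_USER
      valueFrom:
        secretKeyRef:
          name: postgres-secret
          key: PG_USERNAME
    - name: DATA_SOURCE_PASS
      valueFrom:
        secretKeyRef:
          name: postgres-secret
          key: PG_PASSWORD
  ports:
    - containerPort: 9187            # exporter's default metrics port
      name: metrics                  # hypothetical port name
```

With this in place, the pod's `prometheus.io/port` annotation would need to be the numeric exporter port ("9187") rather than the port name `postgres`, so that the scrape job targets the exporter instead of the database.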

