Configuring Prometheus to collect custom metrics from a Dockerized Node.js pod

Question

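I have set up prom-client (an unofficial Prometheus client library) to collect the custom metrics I need. I deployed the Prometheus server with Helm, following this EKS setup guide. Now I am trying to edit the default ConfigMap so that it scrapes my application's metrics, but I get this error: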

parsing YAML file /etc/config/prometheus.yml: yaml: unmarshal errors:\n line 22: field cluster_ip not found in type kubernetes.plain\n line 25: cannot unmarshal !!str default into []string

Here is the prometheus.yaml ConfigMap file I put together following the documentation:

apiVersion: v1
data:
  alerting_rules.yml: |
    {}
  alerts: |
    {}
  prometheus.yml: |
    global:
      evaluation_interval: 1m
      scrape_interval: 1m
      scrape_timeout: 10s
    rule_files:
    - /etc/config/recording_rules.yml
    - /etc/config/alerting_rules.yml
    - /etc/config/rules
    - /etc/config/alerts
    scrape_configs:
    ...DEFAULT CONFIGS...
    - job_name: my_metrics
      scrape_interval: 5m
      scrape_timeout: 10s
      honor_labels: true
      metrics_path: /api/metrics
      kubernetes_sd_configs:
        - role: service
          cluster_ip: 10.100.200.92
          namespaces:
            names:
              default
  recording_rules.yml: |
    {}
  rules: |
    {}
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-08T09:26:38Z"
  labels:
    app: prometheus
    chart: prometheus-11.3.0
    component: server
    heritage: Helm
    release: prometheus
  name: prometheus-server
  namespace: prometheus
  uid: 8fadb17a-f5c5-4f9d-a931-fa1f77684847

The cluster_ip here is the ClusterIP assigned to the Service that exposes my Deployment.

My deployment.yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: myapp
  template:
    metadata:
      labels:
        name: myapp
    spec:
      containers:
        - image: IMAGE_URL:BUILD_NUMBER
          name: myapp
          resources:
              limits:
                cpu: "1000m"
                memory: "2400Mi"
              requests:
                cpu: "500m"
                memory: "2000Mi"
          imagePullPolicy: IfNotPresent
          ports:
              - containerPort: 5000
                name: myapp

My service.yaml file, which exposes the Deployment:

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    deploy: staging
    name: myapp
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 5000
      protocol: TCP

Is there a different or better way to collect metrics from my application? Please let me know. Thanks.

Answer 1

This is what I use to enable Prometheus scraping inside the cluster.

In the scrape config, I have this snippet:

      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - action: labeldrop
            regex: '(kubernetes_pod|app_kubernetes_io_instance|app_kubernetes_io_name|instance)'

This is taken directly from the default values of the Prometheus Helm chart: https://github.com/helm/charts/blob/master/stable/prometheus/values.yaml#L1452

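The kubernetes-pods job instructs Prometheus to scrape every pod that has the annotation prometheus.io/scrape: "true" set. With the following annotations on the pod, you can also configure the port and path to scrape: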

prometheus.io/path: "/metrics"
prometheus.io/port: "9090"
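
For the application in the question, the values can be read off its manifests. This is only a sketch under two assumptions: that prom-client serves metrics at /api/metrics (the metrics_path from the attempted job) and that the process listens on container port 5000. Because role: pod targets the pod IP directly, the port here is the container port, not the Service port 80.

```yaml
# Pod annotations for the app in the question (values inferred from its manifests)
prometheus.io/scrape: "true"
prometheus.io/port: "5000"          # containerPort from the Deployment, assumed to serve metrics
prometheus.io/path: "/api/metrics"  # metrics_path from the attempted my_metrics job
```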

So you also need to modify your deployment.yaml to specify these annotations on the pod template:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: myapp
  template:
    metadata:
      labels:
        name: myapp
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "<enter port of pod to scrape>"
        prometheus.io/path: "<enter path to scrape>"
    spec:
      containers:
        - image: IMAGE_URL:BUILD_NUMBER
...
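
As for the parse error itself: cluster_ip is not a field of kubernetes_sd_configs, and namespaces.names must be a YAML list rather than a bare string, which is what the two messages are saying. If you would rather keep a dedicated job than rely on annotations, one possible shape is sketched below. It assumes role: endpoints and a keep rule on the Service name myapp; both are choices made for illustration, not something taken from the chart defaults. Note also that the Service in the question selects on deploy: staging, a label the pod template does not carry, so as posted it would have no endpoints to discover.

```yaml
# Goes under scrape_configs in prometheus.yml (illustrative, not from the chart)
- job_name: my_metrics
  scrape_interval: 5m
  scrape_timeout: 10s
  honor_labels: true
  metrics_path: /api/metrics
  kubernetes_sd_configs:
    - role: endpoints        # discovers the pod addresses behind each Service
      namespaces:
        names:
          - default          # a list item, not a bare string
  relabel_configs:
    # Keep only the endpoints that belong to the myapp Service
    - source_labels: [__meta_kubernetes_service_name]
      action: keep
      regex: myapp
```

With role: endpoints each replica becomes its own target, whereas scraping the ClusterIP would hit whichever pod the Service happens to route to.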

Answer 1: accepted, score 2, 2020-06-22 12:56:22
