
Helm Prometheus operator doesn't add new ServiceMonitor endpoints to targets

I'm trying to monitor my app using the Prometheus community Helm charts ( https://github.com/prometheus-community/helm-charts ). I've installed the chart successfully:

prometheus-kube-prometheus-operator-5d8dcd5988-bw222   1/1     Running   0          11h
prometheus-kube-state-metrics-5d45f64d67-97vxt         1/1     Running   0          11h
prometheus-prometheus-kube-prometheus-prometheus-0     2/2     Running   0          11h
prometheus-prometheus-node-exporter-gl4cz              1/1     Running   0          11h
prometheus-prometheus-node-exporter-mxrsm              1/1     Running   0          11h
prometheus-prometheus-node-exporter-twvdb              1/1     Running   0          11h
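
For reference, a typical install of this chart looks roughly like the following; the release name prometheus and the monitoring namespace are inferred from the pod names above, so treat them as assumptions:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Release name and namespace inferred from the pod names above;
# adjust to match the actual install:
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace -f values.yaml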

The app's Service and Deployment were created in the same namespace, using these YAML configs:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: appservice
  namespace: monitoring
  labels:
    app: appservice
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/path: '/actuator/prometheus'
spec:
  replicas: 1
  selector:
    matchLabels:
      app: appservice
  template:
    metadata:
      labels:
        app: appservice
...
apiVersion: v1
kind: Service
metadata:
  name: appservice
  namespace: monitoring
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/path: '/actuator/prometheus'
spec:
  selector:
    app: appservice
  type: ClusterIP
  ports:
    - name: web
      protocol: TCP
      port: 8080
      targetPort: 8080
    - name: jvm-debug
      protocol: TCP
      port: 5005
      targetPort: 5005
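
A quick sanity check, assuming the names above, is to confirm the Service actually resolves to the app pod before wiring up monitoring:

# Should list the pod IP:port pairs backing the Service:
kubectl -n monitoring get endpoints appservice
# Should return the app pod:
kubectl -n monitoring get pods -l app=appservice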

And after the app was deployed, I created a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: appservice-servicemonitor
  namespace: monitoring
  labels:
    app: appservice
    release: prometheus-repo
spec:
  selector:
    matchLabels:
      app: appservice # target app service
  namespaceSelector:
    matchNames:
      - monitoring
  endpoints:
  - port: web
    path: '/actuator/prometheus'
    interval: 15s

I expect that after adding this ServiceMonitor, my Prometheus instance will create a new target like "http://appservice:8080/actuator/prometheus", but it does not; the new endpoint doesn't appear in the Prometheus UI.

I tried to change the Helm values by adding additionalServiceMonitors:

namespaceOverride: "monitoring"
nodeExporter:
  enabled: true

prometheus:
  enabled: true
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelector:
      matchLabels:
        release: prometheus-repo
    additionalServiceMonitors:
      namespaceSelector:
        any: true
    replicas: 1
    shards: 1
    storageSpec:
      ...
    securityContext:
      ...
    nodeSelector:
      assignment: monitoring

  nodeSelector:
    assignment: monitoring

prometheusOperator:
  nodeSelector:
    assignment: monitoring
  admissionWebhooks:
    patch:
      securityContext:
        ...
  securityContext:
    ...

global:
  alertmanagerSpec:
    nodeSelector:
      assignment: monitoring

But it didn't help. It is really hard to say what is going wrong: there are no error logs, and all configs apply successfully.
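
Note: in kube-prometheus-stack, additionalServiceMonitors appears to live under the prometheus key as a list of monitor definitions, not under prometheusSpec. A minimal sketch of that shape, reusing the names from the question:

prometheus:
  additionalServiceMonitors:
    # Each list entry describes one extra ServiceMonitor the chart creates:
    - name: appservice-servicemonitor
      selector:
        matchLabels:
          app: appservice
      namespaceSelector:
        matchNames:
          - monitoring
      endpoints:
        - port: web
          path: /actuator/prometheus
          interval: 15s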

I found this guide very helpful.

Please keep in mind that, depending on the Prometheus stack you are using, labels and names can have different default values (for me, using kube-prometheus-stack, the secret name was prometheus-kube-prometheus-stack-prometheus instead of prometheus-k8s).

Essential quotes:

ServiceMonitor reference

Has my ServiceMonitor been picked up by Prometheus?

ServiceMonitor objects and the namespace where they belong are selected by the serviceMonitorSelector and serviceMonitorNamespaceSelector of a Prometheus object. The name of a ServiceMonitor is encoded in the Prometheus configuration, so you can simply grep whether it is present there. The configuration generated by the Prometheus Operator is stored in a Kubernetes Secret, named after the Prometheus object name prefixed with prometheus-, and located in the same namespace as the Prometheus object. For example, for a Prometheus object called k8s, one can find out if the ServiceMonitor named my-service-monitor has been picked up with:

kubectl -n monitoring get secret prometheus-k8s -ojson | jq -r '.data["prometheus.yaml.gz"]' | base64 -d | gunzip | grep "my-service-monitor"
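
Adapted to the setup in this question, this would presumably look like the following (the Prometheus object name is inferred from the prometheus-prometheus-kube-prometheus-prometheus-0 pod; confirm it first with kubectl -n monitoring get prometheus):

kubectl -n monitoring get secret prometheus-prometheus-kube-prometheus-prometheus -ojson \
  | jq -r '.data["prometheus.yaml.gz"]' | base64 -d | gunzip \
  | grep "appservice-servicemonitor"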

You can analyze this using the Prometheus web interface:
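
The localhost:9090 URLs below assume a port-forward to the Prometheus service, for example (the service name is inferred from the pod names above, so adjust it if yours differs):

kubectl -n monitoring port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090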

(1) Check if the ServiceMonitor config appears in the Prometheus config at http://localhost:9090/config. If you can't find your config, I would check whether your config is valid and actually deployed to the cluster.

(2) Check if Prometheus can discover pods via this config: http://localhost:9090/service-discovery

If the service discovery can't find your pods, I would compare all values required by the config with the labels provided by your pods.
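
For the manifests in this question, that comparison is worth doing explicitly: a ServiceMonitor selects Services by label, while the Service above only sets prometheus.io/* annotations and defines no labels block at all. A quick way to see both sides (names assumed from the question):

# What the ServiceMonitor selects on:
kubectl -n monitoring get servicemonitor appservice-servicemonitor \
  -o jsonpath='{.spec.selector.matchLabels}'
# What the Service actually carries (labels, not annotations):
kubectl -n monitoring get svc appservice --show-labels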

(3) If the service discovery has selected your services, check the targets page: http://localhost:9090/targets

Here you will see whether the endpoints are healthy and reachable from Prometheus.
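
If a target shows up but is unhealthy, it can also help to hit the metrics endpoint directly, bypassing Prometheus entirely (again assuming the names above):

kubectl -n monitoring port-forward svc/appservice 8080:8080 &
curl -s http://localhost:8080/actuator/prometheus | head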

Recently I had a case where, after an upgrade of ArgoCD, the default annotation it uses to determine which resources belong to an app changed.

It's currently app.kubernetes.io/instance, which can conflict with (override) the 'expected' release name that Helm generates. As an outcome, the release name can get mixed with the ArgoCD app instance name. In this case you could end up with values like my-release-name and, for example, dev-my-release-name (if your ArgoCD app name differs from the release name defined in the app).

After that, most of my service monitors stopped working, as the ServiceMonitor CRD annotations didn't match the Service annotations. The solution was to stop using the app.kubernetes.io/instance annotation to mark the resources managed by that tool.

Due to the above, I recommend always using argocd.argoproj.io/instance instead of the default one if you have a release name set for your ArgoCD apps.

https://argo-cd.readthedocs.io/en/stable/faq/#why-is-my-app-out-of-sync-even-after-syncing
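
A minimal sketch of that switch, per the FAQ above (assuming the standard argocd-cm ConfigMap in the argocd namespace):

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  # Track resources with a dedicated key instead of the default
  # app.kubernetes.io/instance label, so Helm's own instance label
  # is left untouched:
  application.instanceLabelKey: argocd.argoproj.io/instance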
