
Kubernetes Prometheus metrics for running pods and nodes?

I've set up Prometheus to monitor Kubernetes metrics by following the Prometheus documentation.

A lot of useful metrics now show up in Prometheus.

However, I can't see any metrics referencing the status of my pods or nodes.

Ideally, I'd like to be able to graph the pod status (Running, Pending, CrashLoopBackOff, Error) and the node status (NodeReady, Ready).

Is this metric available anywhere? If not, can I add it somewhere, and how?

The regular Kubernetes setup does not expose these metrics (further discussion here).

However, another service can be used to collect these cluster-level metrics: https://github.com/kubernetes/kube-state-metrics

This currently provides node_status_ready and pod_container_restarts, which sound like what I want.
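
For illustration only, here is a minimal Python sketch of how pod status could be summarised from kube-state-metrics output once it is reachable. It parses a few lines in the Prometheus text exposition format and counts pods per phase. The metric name kube_pod_status_phase, the sample pod names, and the helper count_pods_by_phase are assumptions made for this example; the names exposed by a given kube-state-metrics version may differ from these and from the ones quoted above, so check its /metrics output first.

```python
import re
from collections import Counter

# Hypothetical sample of scraped output. Real metric names and labels depend
# on the kube-state-metrics version in use.
SAMPLE = """\
# HELP kube_pod_status_phase The pods current phase.
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="default",pod="web-1",phase="Running"} 1
kube_pod_status_phase{namespace="default",pod="web-1",phase="Pending"} 0
kube_pod_status_phase{namespace="default",pod="web-2",phase="Pending"} 1
kube_pod_status_phase{namespace="default",pod="web-2",phase="Running"} 0
kube_pod_status_phase{namespace="kube-system",pod="dns-1",phase="Running"} 1
"""

# Matches `name{labels} value` sample lines (no trailing timestamp).
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
# Matches individual `key="value"` pairs inside the braces.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def count_pods_by_phase(text, metric="kube_pod_status_phase"):
    """Count pods whose sample for a given phase has the value 1."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if float(match.group("value")) == 1.0:
            counts[labels.get("phase", "unknown")] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_pods_by_phase(SAMPLE))
```

In practice Prometheus would scrape the endpoint and the same aggregation would be done in a query; the parsing above only shows what the underlying samples look like.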

I don't think such metrics exist.

You have to modify the source code to add them. Take a look at this file to see how to register a metric: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/metrics/metrics.go , and take a look at this line to see how to record a metric: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pleg/generic.go#L180
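
The linked files are Go code inside the kubelet and are not reproduced here. As a rough, standard-library-only Python sketch of the same two-step pattern (register a metric once, then record values where the state is observed), something like the following could serve as a mental model. The Gauge and Registry classes, the metric name example_pods_by_status, and the function record_pod_statuses are all hypothetical and are not part of Kubernetes or any Prometheus client library.

```python
import threading


class Gauge:
    """Tiny stand-in for a gauge metric with a single label (hypothetical)."""

    def __init__(self, name, help_text, label):
        self.name = name
        self.help_text = help_text
        self.label = label
        self._values = {}
        self._lock = threading.Lock()

    def set(self, label_value, value):
        """Record the current value for one label value."""
        with self._lock:
            self._values[label_value] = float(value)

    def render(self):
        """Render the gauge in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} gauge",
        ]
        with self._lock:
            for label_value, value in sorted(self._values.items()):
                lines.append(
                    f'{self.name}{{{self.label}="{label_value}"}} {value:g}'
                )
        return "\n".join(lines) + "\n"


class Registry:
    """Keeps track of registered metrics so they can be exposed together."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric):
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} already registered")
        self._metrics[metric.name] = metric
        return metric

    def render(self):
        return "".join(m.render() for m in self._metrics.values())


REGISTRY = Registry()

# Step 1: register the metric once, at start-up.
PODS_BY_STATUS = REGISTRY.register(
    Gauge("example_pods_by_status", "Number of pods per status.", "status")
)


# Step 2: record values at the point where pod state is observed.
def record_pod_statuses(statuses):
    counts = {}
    for status in statuses:
        counts[status] = counts.get(status, 0) + 1
    for status, number in counts.items():
        PODS_BY_STATUS.set(status, number)


if __name__ == "__main__":
    record_pod_statuses(["Running", "Running", "Pending"])
    print(REGISTRY.render(), end="")
```

A real change would use the Prometheus Go client inside the kubelet as the linked files do; this sketch only mirrors the structure of registering and then recording.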

I've found that I can monitor these metrics using heapster and snap, which is a plausible workaround for my case. Let me know if that's something you're also using and I'll give you the proper metrics to get this data.


 