
Why is my DataDog instance reporting a Kubernetes "no_pod"?

Original post: 2016-11-22 12:24:20 | Tags: kubernetes, kubernetes-health-check, datadog

We are running a Kubernetes cluster in AWS and collecting its metrics in DataDog using the dd-agent DaemonSet.

Our metrics show a pod tagged as "no_pod", and it is using a lot of resources: memory, CPU, network Tx, and network Rx.

Is there any explanation of what this pod is, and how I can find it, kill it, restart it, etc.?

I have found the dd-agent source code, which seems to define the "no_pod" label, but I can't make much sense of why it is there, where it comes from, or how I can find it through kubectl etc.

1 Answer

After speaking to the support team at DataDog, I managed to find out the following about what the no_pod pods are.

Our Kubernetes check is getting the list of containers from the Kubernetes API, which exposes aggregated data. In the metric explorer configuration here, you can see a couple of containers named /docker and / that are getting picked up along with the other containers. Metrics with pod_name:no_pod that come from container_name:/ and container_name:/docker are just metrics aggregated across multiple containers. (So it makes sense that these are the highest values in your graphs.) If you don't want your graphs to show these aggregated container metrics, though, you can clone the dashboard and then exclude these pods from the query. To do so, on the cloned dashboard, just edit the query in the JSON tab, and in the tag scope, add !pod_name:no_pod.
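The tag-scope edit described above can be sketched in a few lines of Python. This is only an illustration of the string change, not anything DataDog ships: the helper name exclude_no_pod is hypothetical, and the metric name in the usage note is a placeholder for whatever query your cloned dashboard actually contains. It assumes the usual metric{scope} by {tag} query shape and only touches the first {...} section.

```python
import re

# Tag filter quoted by DataDog support for hiding the aggregated containers.
EXCLUDE = "!pod_name:no_pod"


def exclude_no_pod(query: str) -> str:
    """Add !pod_name:no_pod to the first {...} scope of a query string.

    Hypothetical helper: it only rewrites text and never talks to DataDog.
    """
    match = re.search(r"\{([^}]*)\}", query)
    if match is None:
        raise ValueError("query has no {scope} section")

    # Split the scope into individual tag filters.
    tags = [t.strip() for t in match.group(1).split(",") if t.strip()]
    # A bare "*" means "everything" and is redundant once a filter is present.
    tags = [t for t in tags if t != "*"]
    # Append the exclusion only once, so the function is safe to re-run.
    if EXCLUDE not in tags:
        tags.append(EXCLUDE)

    new_scope = "{" + ",".join(tags) + "}"
    return query[:match.start()] + new_scope + query[match.end():]
```

For example, exclude_no_pod("avg:kubernetes.cpu.usage.total{*} by {pod_name}") returns "avg:kubernetes.cpu.usage.total{!pod_name:no_pod} by {pod_name}", which is the form you would paste back into the JSON tab of the cloned dashboard.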

So it appears that these "pods" are the Docker (/docker) and root-level (/) containers, which are not part of any Kubernetes pod, and they will always be displayed unless you filter them out specifically, which I now do.

Many thanks to the support team at DataDog for looking into the issue, giving me a great explanation of what the pods were, and essentially confirming that I can safely filter them out and not worry about them.