
Kubernetes, Java and Grafana - How to display only the running containers?

I'm working on a setup where we run our Java services in Docker containers hosted on a Kubernetes platform.

I want to create a Grafana dashboard where I can monitor the heap usage of all instances of a service. I'm writing metrics to statsd with the pattern:

<servicename>.<containerid>.<processid>.heapspace

This works well; I can see all heap usages in my chart.

After a redeployment, the container names change, so new series are added to the existing graph. My problem is that the old lines keep being drawn at the position of the last value received, even though those containers are already dead.

Is there a simple solution for this in Grafana? Can I tell it: if no data has been received for a metric for more than X seconds, stop drawing its line?

[Chart: many containers exited at 14:00, but their lines in the graph continue]

Update: Upgrading to the newest Grafana version and setting "Null value" to "null" under "Stacking & Null value" didn't work.

Maybe it's a problem with statsd?

I'm sending data to statsd in form of:

felix.javaclient.machine<number>-<pid>.heap:<heapvalue>|g

Is anything wrong with this?
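For reference, a minimal sketch of how such a gauge line can be emitted from a Java service over statsd's plain UDP protocol. The host, port, machine name, and class name are illustrative assumptions, not part of the original setup:

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class HeapGauge {

    // Builds a statsd gauge line in the pattern from the question,
    // e.g. "felix.javaclient.machine1-4711.heap:987654321|g".
    static String gaugeLine(String machine, long pid, long heapBytes) {
        return "felix.javaclient." + machine + "-" + pid + ".heap:" + heapBytes + "|g";
    }

    // Fires the line at statsd; UDP, so this is fire-and-forget.
    static void send(String line) throws IOException {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            // 8125 is statsd's default UDP port; host is an assumption.
            socket.send(new DatagramPacket(payload, payload.length,
                    new InetSocketAddress("localhost", 8125)));
        }
    }

    public static void main(String[] args) throws IOException {
        Runtime rt = Runtime.getRuntime();
        long usedHeap = rt.totalMemory() - rt.freeMemory();
        long pid = ProcessHandle.current().pid();
        send(gaugeLine("machine1", pid, usedHeap));
    }
}
```

The format itself (name:value|g) is valid statsd gauge syntax, so nothing is wrong with the metric line as such; the stale-line behavior comes from how statsd flushes gauges, as described below.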

This can happen for two reasons: because Grafana is using the "connected" setting for null values, and/or (as is the case here) because statsd keeps sending the previously seen value for a gauge when there are no updates in the current flush period.

Grafana Config

You'll want to make two adjustments to your graph config:

First, go to the "Display" tab and, under "Stacking & Null value", change "Null value" to "null". That will cause Grafana to stop drawing a line where a series has no data.

Second, if you're using a legend, go to the "Legend" tab and, under "Hide series", check the "With only nulls" checkbox. Series will then appear in the legend only if they have at least one non-null value during the graph period.

statsd Config

The statsd documentation for gauge metrics tells us:

If the gauge is not updated at the next flush, it will send the previous value. You can opt to send no metric at all for this gauge, by setting config.deleteGauges

So the Grafana changes alone aren't enough in this case, because the values in Graphite aren't actually null (statsd keeps sending the last reading). If you set deleteGauges: true in the statsd config, statsd won't send anything for a gauge that received no updates, and Graphite will contain the null values we expect.
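A minimal sketch of the corresponding statsd config.js; the graphiteHost and graphitePort values are illustrative and should match your own Graphite/carbon setup:

```javascript
// statsd config.js (sketch). With deleteGauges set, statsd stops
// repeating the last gauge reading on flush intervals that received
// no updates, so Graphite records nulls for dead containers instead.
{
  graphiteHost: "127.0.0.1",  // your carbon host (assumption)
  graphitePort: 2003,         // carbon plaintext port (assumption)
  deleteGauges: true
}
```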

Graphite Note

As a side note, a setup like this will cause your data folder to grow continuously as you create new series each time a container is launched. You'll definitely want to look into removing old series after some period of inactivity to avoid filling up the disk. If you're using graphite with whisper that can be as simple as a cron task running find /var/lib/graphite/whisper/ -name '*.wsp' -mtime +30 -delete to remove whisper files that haven't been modified in the last 30 days.
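The cleanup command above can be sanity-checked on a throwaway directory before pointing it at your real whisper data. This sketch assumes GNU coreutils (touch -d) and illustrative file names:

```shell
# Simulate a whisper data directory with one stale and one live series.
tmpdir=$(mktemp -d)
touch -d '60 days ago' "$tmpdir/dead-container.wsp"  # not updated in 60 days
touch "$tmpdir/live-container.wsp"                    # updated just now

# Same find invocation as in the cron task, against the test directory:
# delete .wsp files not modified in the last 30 days.
find "$tmpdir" -name '*.wsp' -mtime +30 -delete

ls "$tmpdir"   # only live-container.wsp remains
rm -rf "$tmpdir"
```

In a real deployment you would replace the temp directory with /var/lib/graphite/whisper/ and tune the -mtime threshold to your retention needs.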

To do this, I would use

maximumAbove(transformNull(felix.javaclient.*.heap, 0), 0)

The transformNull will take any datapoint that is currently null, or unreported for that instant in time, and turn it into a 0 value.

The maximumAbove will only display series that have a maximum value above 0 during the selected time period.

Using maximumAbove, you see all historical containers. If you wish to see only the currently running containers, use currentAbove in its place:

currentAbove(transformNull(felix.javaclient.*.heap, 0), 0)
