
Summarize CollectD CPU-Stats for multiple servers with different CPU count across servers

I'm trying to build a graph that shows worst-case CPU usage across a variable set of servers. I'm getting the data from collectd, which reports statistics for each CPU core separately. The problem is that servers within the set may have different numbers of CPU cores.

What I had so far (one series for each cpu-foo property): sumSeriesWithWildcards(sumSeriesWithWildcards(summarize(servers.$foo.$bar.*.collectd.cpu-*.cpu-system.value, '$timeframe', 'max', false), 5), 3)

This obviously skews the graph towards cpu-idle: the servers are for the most part evenly loaded, so servers with more CPU cores show a higher idle ratio than servers with fewer cores.

To clarify: I'd like to summarize the cpu-* series sums of each server to the max across all servers, except for idle, which I'd like to summarize to the min. Because of that, I need a way to normalize each server's sums to 100% before summarizing them.

So far I have come to this, which is a little bit better: divideSeries(sumSeriesWithWildcards(sumSeriesWithWildcards(summarize(servers.$foo.$bar.*.collectd.cpu-*.cpu-system.value, '$timeframe', 'max', false), 5), 3), #L)

However, this still isn't satisfactory. It's not as skewed, but it still does not fulfill the purpose of this graph: to show worst-case CPU usage across servers.

What I need to do, but can't figure out how, is the following:

  1. for each server (segment 3), count its cpu-* series, then
  2. sum each cpu-*.foo for this server and divide it by the count from 1.
  3. sum each result from 2. and summarize

What I'm missing is step 2. Basically, I need a way to normalize the different CPU values for each server before summing them across all servers.
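To make the three steps concrete outside of Graphite, here is a minimal Python sketch of the arithmetic for a single point in time. The server names, the sample values, and the helper names `normalize` and `worst_case` are all made up for illustration; it only shows the normalization I'm after, not how Graphite would do it.

```python
# Hypothetical samples for one timestamp: server -> metric -> per-core percentages.
samples = {
    "web1": {"cpu-system": [20.0, 40.0], "cpu-idle": [70.0, 50.0]},  # 2 cores
    "web2": {"cpu-system": [10.0] * 4, "cpu-idle": [85.0] * 4},      # 4 cores
}


def normalize(per_core_by_metric):
    """Steps 1 and 2: sum each metric over the cores, divide by the core count."""
    return {
        metric: sum(cores) / len(cores)
        for metric, cores in per_core_by_metric.items()
    }


def worst_case(samples):
    """Step 3: max across servers for every metric, but min for cpu-idle."""
    normalized = [normalize(metrics) for metrics in samples.values()]
    result = {}
    for metric in normalized[0]:
        values = [server[metric] for server in normalized]
        result[metric] = min(values) if metric == "cpu-idle" else max(values)
    return result


print(worst_case(samples))  # {'cpu-system': 30.0, 'cpu-idle': 60.0}
```

With these numbers the 2-core server is the busier one, and it determines the result even though the 4-core server contributes twice as many series.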

Is there any way to do this?

Edit: This, of course, would also be useful for other metrics that are not uniform across servers, e.g. RAM.

Try this:

summarize(sumSeries(averageSeriesWithWildcards(servers.$foo.$bar.*.collectd.cpu-*.cpu-system.value, 5)), '$timeframe', 'max', false)

I'm not sure it will work, but I believe it follows the steps you outlined, and perhaps you can tune it to make it work. :) See the docs about Graphite functions.
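As a rough illustration of what the inner call is meant to do, the Python sketch below mimics the grouping: it drops the path node at position 5 (the `cpu-*` node) and averages all series that end up with the same name, which yields one normalized series per server. The metric paths and values are hypothetical, `average_with_wildcards` is a stand-in written for this example rather than Graphite's implementation, and unlike Graphite it does not handle missing datapoints.

```python
from collections import defaultdict


def average_with_wildcards(series, position):
    """Average, point by point, all series whose names match once the
    node at `position` is removed from the dotted path."""
    groups = defaultdict(list)
    for name, values in series.items():
        nodes = name.split(".")
        key = ".".join(n for i, n in enumerate(nodes) if i != position)
        groups[key].append(values)
    return {
        key: [sum(col) / len(col) for col in zip(*rows)]
        for key, rows in groups.items()
    }


# Hypothetical input: server "a" has 2 cores, server "b" has 4 cores.
series = {
    "servers.dc1.web.a.collectd.cpu-0.cpu-system.value": [20.0, 40.0],
    "servers.dc1.web.a.collectd.cpu-1.cpu-system.value": [40.0, 60.0],
}
for core in range(4):
    series[f"servers.dc1.web.b.collectd.cpu-{core}.cpu-system.value"] = [10.0, 20.0]

per_server = average_with_wildcards(series, 5)
summed = [sum(col) for col in zip(*per_server.values())]  # like sumSeries
worst = [max(col) for col in zip(*per_server.values())]   # like maxSeries

print(per_server["servers.dc1.web.a.collectd.cpu-system.value"])  # [30.0, 50.0]
print(summed)  # [40.0, 70.0]
print(worst)   # [30.0, 50.0]
```

Note that summing the per-server averages still grows with the number of servers. If the goal is strictly the worst case, swapping `sumSeries` for `maxSeries` (or `minSeries` for idle) in the expression above may be closer to it, though I haven't tested that against your data.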
