R CPU utilization heat map for a data center

Question

I have readings from 100 servers in a data center. The readings are in a data frame with three columns: Time, Hostname, and CPU utilization. A monitoring system generates them every 10 minutes. I need to plot a heat map of CPU utilization with time on the X axis, % of servers on the Y axis, and the CPU utilization range shown in the heat map.

For example, if the total number of servers is 5, the input data is as follows:

Time            CPU Hostname
1/25/2015 10:15  19%    H1
1/25/2015 10:15  90%    H2
1/25/2015 10:15  90%    H3
1/25/2015 10:15  50%    H4
1/25/2015 10:15  25%    H5
1/25/2015 10:25  30%    H1
1/25/2015 10:25  85%    H2
1/25/2015 10:25  30%    H3
1/25/2015 10:25  21%    H4
1/25/2015 10:25  21%    H5

The required output is a stacked chart that depicts the following figures as a heat map.

For example, at 10:15 there are 2 servers in the 80-100% utilization range, so the value is 40%:

Range   10:15   10:25
0-20    20%      0%
20-40   20%      80%
40-60   20%      0%
60-80   0%       0%
80-100  40%      20%

I need help with the R functions to plot this kind of heat map. I have tried xts, but I am not clear on how to apply the xts package to this use case.

Answer 1

You just need to:

cut the values into the groups you need
find the % per group
expand out the missing entries
use geom_tile for your heatmap
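The first three steps can be sketched on a toy vector using base R only. This is a minimal illustration, not the code from this answer: the names cpu, ranges and pct are made up for the example, and the values are the 10:15 readings from the question.

```r
# Hypothetical toy input: CPU readings (percent) for five hosts at one timestamp.
cpu <- c(19, 90, 90, 50, 25)

# Step 1: bin the readings into 20-point ranges.
# include.lowest = TRUE keeps a reading of exactly 0 in the first bin.
ranges <- cut(cpu,
              breaks = c(0, 20, 40, 60, 80, 100),
              labels = c("0-20", "20-40", "40-60", "60-80", "80-100"),
              include.lowest = TRUE)

# Steps 2 and 3: share of hosts per range. table() on a factor keeps
# empty levels, so the "60-80" bin is present with a share of 0.
pct <- as.vector(table(ranges)) / length(cpu)
names(pct) <- levels(ranges)
pct
```

Keeping the bins as a factor is what makes the empty range appear here; once the bins are converted to character, empty ranges have to be added back explicitly.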

Many of the components of the following code appear in many SO posts:

library(dplyr)
library(ggplot2)
library(tidyr)
library(scales)

dat <- read.table(text="Time,CPU,Hostname
1/25/2015 10:15,19%,H1
1/25/2015 10:15,90%,H2
1/25/2015 10:15,90%,H3
1/25/2015 10:15,50%,H4
1/25/2015 10:15,25%,H5
1/25/2015 10:25,30%,H1
1/25/2015 10:25,85%,H2
1/25/2015 10:25,30%,H3
1/25/2015 10:25,21%,H4
1/25/2015 10:25,21%,H5", header=TRUE, sep=",", stringsAsFactors=FALSE)


# denominator for the percentages: number of distinct hosts
total_hosts <- length(unique(dat$Hostname))

dat %>%
  mutate(Time=as.POSIXct(Time, format="%m/%d/%Y %H:%M"),
         Day=format(Time, format="%Y-%m-%d"),
         HM=format(Time, format="%H:%M"),
         CPU=as.numeric(gsub("%", "", CPU)),
         # bin CPU into 20-point ranges; include.lowest keeps a 0% reading
         `CPU Range`=as.character(cut(CPU, 
                                breaks=c(0,20,40,60,80,100), 
                                labels=c("0-20", "20-40", "40-60", 
                                         "60-80", "80-100"),
                                include.lowest=TRUE))) %>%
  group_by(Day, `CPU Range`, HM) %>%
  summarise(Pct=n()/total_hosts) %>%
  ungroup() %>%  # drop grouping so expand() builds every combination
  merge(expand(., `CPU Range`, HM, Day), all.y=TRUE) -> dat

gg <- ggplot(dat, aes(x=HM, y=`CPU Range`))
gg <- gg + geom_tile(aes(fill=Pct), color="#7f7f7f")
gg <- gg + scale_fill_distiller(palette="RdPu", na.value="white", 
                                labels=percent, name="% Hosts")
gg <- gg + coord_equal()
gg <- gg + labs(x=NULL)
gg <- gg + theme_bw()
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(panel.grid=element_blank())
gg


I left Day in the data frame in case you want or need to facet_wrap by it or aggregate by it.
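As a rough sketch of the facet_wrap idea, assuming the summarised data spans more than one day: the data frame heat below is hypothetical, with invented values and a column named Range rather than `CPU Range`, so it is not the output of the code above.

```r
library(ggplot2)

# Hypothetical summarised data covering two days (values invented for illustration).
heat <- data.frame(
  Day   = rep(c("2015-01-25", "2015-01-26"), each = 4),
  HM    = rep(c("10:15", "10:25"), times = 4),
  Range = rep(rep(c("0-20", "80-100"), each = 2), times = 2),
  Pct   = c(0.2, 0.0, 0.4, 0.2, 0.6, 0.4, 0.0, 0.2)
)

# One heat-map panel per day, stacked vertically.
p <- ggplot(heat, aes(x = HM, y = Range)) +
  geom_tile(aes(fill = Pct), color = "#7f7f7f") +
  facet_wrap(~ Day, ncol = 1) +
  theme_bw()
p
```

Stacking the panels in one column keeps the time axis aligned across days, which may make day-to-day comparison easier.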

Answer 1
2 2015-01-25 12:55:46