
R Heatmap using binned columns and rows

I have a very large table (403k rows) which contains some categorical continuous performance values (flow, pressure, etc.) which I would like to plot against the sold value. I would like to create a heatmap or contour plot from it, using binned values on Q, W, and E, with the heatmap showing the Sales so that I can aggregate the sold values. For the example, let's set the table (df) as:

Q<-c(0.5,1,2,3,3.5,4,4,3,3,4,1,2)
W<-c(1,0.5,2,3,3,4,4,2,1,2,2,1)
E<-c(2,2,2,1,1,5,5,2,3,4,4,1)
Sales<-c(5,30,30,5,10,10,5,5,5,12,20,40)
df <- data.frame(Q = Q, W = W, E = E, Sales = Sales)

In my real table, Q is actually a value that ranges from 0 to 40, whereas H ranges from 0 to 20 and P from 20 to 1000. I have tried making a ggplot and scaling the color with ggsci, using ggplot(df) + geom_tile(aes(x = Q, y = W, fill = Sales), color = NA) + scale_fill_gsea(); however, this produces some tiny dots which are hardly readable (see picture). Hence, I think geom_tile does not bin or aggregate the Q and W values along with Sales(?)

[image]
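To illustrate the suspicion above: geom_tile() draws one tile per row at the exact Q and W coordinates and does not aggregate anything, so the binning has to happen before plotting. A minimal sketch using base R's cut() and aggregate() on the example data; the bin edges 0:4 are an assumption picked for this toy table, and the real ranges would need their own edges (plus include.lowest = TRUE if a value sits exactly on the lowest edge):

```r
library(ggplot2)

df <- data.frame(
  Q = c(0.5, 1, 2, 3, 3.5, 4, 4, 3, 3, 4, 1, 2),
  W = c(1, 0.5, 2, 3, 3, 4, 4, 2, 1, 2, 2, 1),
  Sales = c(5, 30, 30, 5, 10, 10, 5, 5, 5, 12, 20, 40)
)

# Hypothetical bin edges of width 1; pick edges that suit the real data.
edges <- 0:4

# cut() assigns each value to a right-closed interval such as (0,1] or (1,2].
df$Q_bin <- cut(df$Q, breaks = edges)
df$W_bin <- cut(df$W, breaks = edges)

# Sum Sales within every occupied (Q_bin, W_bin) cell.
binned <- aggregate(Sales ~ Q_bin + W_bin, data = df, FUN = sum)

# The bins are factors, so each tile covers one full grid cell.
p <- ggplot(binned, aes(x = Q_bin, y = W_bin, fill = Sales)) +
  geom_tile() +
  geom_text(aes(label = Sales))
p
```

Because the aggregation is done up front, the 403k rows collapse to at most one row per cell before ggplot2 sees them.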

What I am trying to create is something more like this (ugly) thing which I quickly made in Excel for this example:

[image]

Now I'm no expert at all, so I was hoping that someone out there knows how to plot this in a neat and elegant way, either via a heatmap, or maybe a 2D density plot of some kind?

EDIT: If I use ggplot(df, aes(Q,H)) + geom_hex(color = df$Sales) I get an error, and using just geom_hex() gives me something closer, but the colors do not scale according to the sales amount.

EDIT: Added "half" an answer at the bottom, using geom_bin2d(), which goes along with geom_hex().
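By default geom_bin2d() fills each bin by the number of rows in it, which would explain colours that ignore the sales amount. One way to make the fill follow Sales is the weight aesthetic, which turns each bin's count into the sum of the weights. A minimal sketch on the example data; binwidth = c(1, 1) is an assumption chosen for this toy table:

```r
library(ggplot2)

df <- data.frame(
  Q = c(0.5, 1, 2, 3, 3.5, 4, 4, 3, 3, 4, 1, 2),
  W = c(1, 0.5, 2, 3, 3, 4, 4, 2, 1, 2, 2, 1),
  Sales = c(5, 30, 30, 5, 10, 10, 5, 5, 5, 12, 20, 40)
)

# weight = Sales makes each bin's "count" the sum of Sales in that bin
# instead of the number of rows.
p <- ggplot(df, aes(x = Q, y = W)) +
  geom_bin2d(aes(weight = Sales), binwidth = c(1, 1)) +
  labs(fill = "Sales")

# Inspect the computed bins; the count column holds the weighted sums.
binned <- layer_data(p)
p
```

Note that Sales goes inside aes() here. Passing a whole column as a fixed parameter, as in color = df$Sales, asks for one colour per row rather than one per bin, which is a likely cause of the error.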

I found a way to accomplish what I asked (see below). Other suggestions on how to visualize it elegantly are much appreciated!

library(ggplot2)
library(scales)  # provides comma() for the legend labels

ggplot(df, aes(x = Q, y = W, z = Sales)) + stat_bin2d(bins = 10) +
  stat_summary_2d(bins = 10, fun = sum) +  # fill each bin with its summed Sales
  stat_summary_2d(bins = 10, aes(label = after_stat(value)), fun = sum, geom = "text") +  # print the sums
  scale_fill_gradient(labels = comma, name = "Sales", low = "lightblue", high = "green", trans = "log10")

[image]

EDIT: Updated my answer. Now the issue is to scale the colour correctly (see image above).
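One possible contributor to the colour scaling, offered as a guess rather than a diagnosis: stat_bin2d() fills by row count while stat_summary_2d() fills by summed Sales, and all layers share a single fill scale, so the counts can stretch the scale's range. A minimal sketch that keeps only the summary layers, so the scale is trained on the per-bin totals alone (the log10 transform and comma labels are left out here for brevity and can be added back):

```r
library(ggplot2)

df <- data.frame(
  Q = c(0.5, 1, 2, 3, 3.5, 4, 4, 3, 3, 4, 1, 2),
  W = c(1, 0.5, 2, 3, 3, 4, 4, 2, 1, 2, 2, 1),
  Sales = c(5, 30, 30, 5, 10, 10, 5, 5, 5, 12, 20, 40)
)

p <- ggplot(df, aes(x = Q, y = W, z = Sales)) +
  # Tiles coloured by the sum of Sales in each bin.
  stat_summary_2d(bins = 10, fun = sum) +
  # The same sums printed as text on top of the tiles.
  stat_summary_2d(aes(label = after_stat(value)), bins = 10, fun = sum, geom = "text") +
  scale_fill_gradient(name = "Sales", low = "lightblue", high = "green")

# Per-bin totals computed by the first layer.
binned <- layer_data(p, 1)
p
```

If the colours still look compressed on the real data, explicit limits or breaks on scale_fill_gradient() would be the next thing to try.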

