
R: mgcv add colorbar to 2D heatmap of GAM

I'm fitting a GAM with mgcv and plotting the result with the default plot.gam() function. My model includes a 2D smoother and I want to plot the result as a heatmap. Is there any way to add a colorbar for the heatmap?

I've previously looked into other GAM plotting packages, but none of them provided the necessary visualisation. Please note, this is just a simplification for illustration purposes; the actual model (and reporting needs) is much more complicated.

Edited: I initially had swapped y and z in my tensor product; updated to reflect the correct version both in the code and the plot.

df.gam <- gam(y ~ te(x, z), data = df, method = 'REML')
plot(df.gam, scheme = 2, hcolors = heat.colors(999, rev = TRUE), rug = FALSE)

[Figure: 2D GAM heatmap]

Sample data:

structure(list(x = c(3, 17, 37, 9, 4, 11, 20.5, 11.5, 16, 17, 
18, 15, 13, 29.5, 13.5, 25, 15, 13, 20, 20.5, 17, 11, 11, 5, 
16, 13, 3.5, 16, 16, 5, 20.5, 2, 20, 9, 23.5, 18, 3.5, 16, 23, 
3, 37, 24, 5, 2, 9, 3, 8, 10.5, 37, 3, 9, 11, 10.5, 9, 5.5, 8, 
22, 15.5, 18, 15, 3.5, 4.5, 20, 22, 4, 8, 18, 19, 26, 9, 5, 18, 
10.5, 30, 15, 13, 27, 19, 5.5, 18, 11.5, 23.5, 2, 25, 30, 17, 
18, 5, 16.5, 9, 2, 2, 23, 21, 15.5, 13, 3, 24, 17, 4.5), z = c(144, 
59, 66, 99, 136, 46, 76, 87, 54, 59, 46, 96, 38, 101, 84, 64, 
92, 56, 69, 76, 93, 109, 46, 124, 54, 98, 131, 89, 69, 124, 105, 
120, 69, 99, 84, 75, 129, 69, 74, 112, 66, 78, 118, 120, 103, 
116, 98, 57, 66, 116, 108, 95, 57, 41, 20, 89, 61, 61, 82, 52, 
129, 119, 69, 61, 136, 98, 94, 70, 77, 108, 118, 94, 105, 52, 
52, 38, 73, 59, 110, 97, 87, 84, 119, 64, 68, 93, 94, 9, 96, 
103, 119, 119, 74, 52, 95, 56, 112, 78, 93, 119), y = c(96.535, 
113.54, 108.17, 104.755, 94.36, 110.74, 112.83, 110.525, 103.645, 
117.875, 105.035, 109.62, 105.24, 119.485, 107.52, 107.925, 107.875, 
108.015, 115.455, 114.69, 116.715, 103.725, 110.395, 100.42, 
108.79, 110.94, 99.13, 110.935, 112.94, 100.785, 110.035, 102.95, 
108.42, 109.385, 119.09, 110.93, 99.885, 109.96, 116.575, 100.91, 
114.615, 113.87, 103.08, 101.15, 98.68, 101.825, 105.36, 110.045, 
118.575, 108.45, 99.21, 109.19, 107.175, 103.14, 94.855, 108.15, 
109.345, 110.935, 112.395, 111.13, 95.185, 100.335, 112.105, 
111.595, 100.365, 108.75, 116.695, 110.745, 112.455, 104.92, 
102.13, 110.905, 107.365, 113.785, 105.595, 107.65, 114.325, 
108.195, 96.72, 112.65, 103.81, 115.93, 101.41, 115.455, 108.58, 
118.705, 116.465, 96.89, 108.655, 107.225, 101.79, 102.235, 112.08, 
109.455, 111.945, 104.11, 94.775, 110.745, 112.44, 102.525)), row.names = c(NA, 
-100L), class = "data.frame")

It would be easier (IMHO) to do this reliably within the ggplot2 ecosystem.

I'll show a canned approach using my {gratia} package, but also check out {mgcViz}. I'll also suggest a more generic solution using tools from {gratia} to extract information about your model's smooths and then plot them yourself using ggplot().

library('mgcv')
library('gratia')
library('ggplot2')
library('dplyr')

# load your snippet of data via df <- structure( .... )

# then fit your model (note you have y as both the response & in the tensor product;
# I assume z is the response below and x and y are coordinates)
m <- gam(z ~ te(x, y), data=df, method='REML')

# now visualize the model using {gratia}
draw(m)

This produces:

[Figure: output of draw(m)]

{gratia}'s draw() methods can't plot everything yet, but where it doesn't work you should still be able to evaluate the data you need using tools in {gratia}, which you can then plot with ggplot() itself by hand.

To get values for your smooths, i.e. the data behind the plots that plot.gam() or draw() display, use gratia::smooth_estimates()

# dist controls what we do with covariate combinations too far
# from support of the data. 0.1 matches mgcv:::plot.gam behaviour
sm <- smooth_estimates(m, dist = 0.1)

yielding

r$> sm                                                                          
# A tibble: 10,000 × 7
   smooth  type   by      est    se     x     y
   <chr>   <chr>  <chr> <dbl> <dbl> <dbl> <dbl>
 1 te(x,y) Tensor NA     35.3 11.5      2  94.4
 2 te(x,y) Tensor NA     35.5 11.0      2  94.6
 3 te(x,y) Tensor NA     35.7 10.6      2  94.9
 4 te(x,y) Tensor NA     35.9 10.3      2  95.1
 5 te(x,y) Tensor NA     36.2  9.87     2  95.4
 6 te(x,y) Tensor NA     36.4  9.49     2  95.6
 7 te(x,y) Tensor NA     36.6  9.13     2  95.9
 8 te(x,y) Tensor NA     36.8  8.78     2  96.1
 9 te(x,y) Tensor NA     37.0  8.45     2  96.4
10 te(x,y) Tensor NA     37.2  8.13     2  96.6
# … with 9,990 more rows

In the output, x and y are a grid of values over the range of both covariates (the number of points in the grid in each covariate is controlled by n such that the grid for a 2d tensor product smooth is of size n by n). est is the estimated value of the smooth at the values of the covariates and se its standard error. For models with multiple smooths, the smooth variable uses the internal label that {mgcv} gives each smooth; these are the labels used in the output you get from calling summary() on your GAM.
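To make the grid and the dist masking a bit more concrete, here is a minimal sketch using only {mgcv}. It is not what {gratia} does internally; the toy observations, the grid size n = 11 and the keep column are assumptions made up for illustration. mgcv::exclude.too.far() is the helper that plot.gam() relies on for its too.far argument. The 10,000 rows in the tibble above are consistent with a 100 by 100 grid.

```r
library(mgcv)  # exclude.too.far() ships with mgcv

# Hypothetical toy data: two observations at opposite corners of the unit square
obs <- data.frame(x = c(0, 1), y = c(0, 1))

# An n-by-n grid over the range of both covariates (n = 11 here)
n <- 11
grid <- expand.grid(
  x = seq(min(obs$x), max(obs$x), length.out = n),
  y = seq(min(obs$y), max(obs$y), length.out = n)
)
nrow(grid)  # n * n rows, one per grid node

# Flag grid nodes further than `dist` from every observation
# (distances are measured after scaling the grid to the unit square)
too_far <- exclude.too.far(grid$x, grid$y, obs$x, obs$y, dist = 0.1)
grid$keep <- !too_far
```

Nodes flagged as too far are the ones that end up blank in the heatmap, which is why a larger dist fills in more of the plotting region.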

We can add a confidence interval if needed using add_confint().
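If you would rather see what such an interval amounts to, the following base R sketch computes a pointwise 95% interval by hand from est and se using a normal approximation. It is an assumption that this mirrors what add_confint() reports; the column names lower_ci and upper_ci are chosen here for illustration, and the three rows are copied from the printed output above.

```r
# First three rows of `sm` as printed above (est and se only)
sm_head <- data.frame(
  est = c(35.3, 35.5, 35.7),
  se  = c(11.5, 11.0, 10.6)
)

# Pointwise 95% interval under a normal approximation
crit <- qnorm(0.975)
sm_head$lower_ci <- sm_head$est - crit * sm_head$se
sm_head$upper_ci <- sm_head$est + crit * sm_head$se
sm_head
```

These are pointwise intervals for the smooth at each grid node, not simultaneous intervals for the whole surface.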

Now you can plot your smooth(s) by hand using ggplot(). At this point you have two options:

  1. if draw() can handle the type of smooth you want to plot, you can use the draw() method for that object and then build upon it, or
  2. plot everything by hand.

Option 1

# evaluate just the smooth you want to plot
smooth_estimates(m, smooth = "te(x,y)", dist = 0.1) %>%
  draw() +
  geom_point(data = df, alpha = 0.2) # add a point layer for original data

This pretty much gets you what draw() produced when given the model object itself. And you can add to it as if it were a ggplot object (which is not the case for the objects returned by gratia:::draw.gam(), which are wrapped by {patchwork} and need other ways to interact with the plots).

Option 2

Here you are in full control

sm <- smooth_estimates(m, smooth = "te(x,y)", dist = 0.1)
ggplot(sm, aes(x = x, y = y)) +
  geom_raster(aes(fill = est)) +
  geom_point(data = df, alpha = 0.2) + # add a point layer for original data
  scale_fill_viridis_c(option = "plasma")

which produces

[Figure: heatmap of the smooth using the plasma palette]

A diverging palette is likely better for this, along the lines of the one gratia:::draw.smooth_estimates uses

sm <- smooth_estimates(m, smooth = "te(x,y)", dist = 0.1)
ggplot(sm, aes(x = x, y = y)) +
  geom_raster(aes(fill = est)) +
  geom_contour(aes(z = est), colour = "black") +
  geom_point(data = df, alpha = 0.2) + # add a point layer for original data
  scale_fill_distiller(palette = "RdBu", type = "div") +
  # symmetric limits about zero, so the palette's midpoint sits at est = 0
  expand_limits(fill = c(-1, 1) * max(abs(sm[["est"]]), na.rm = TRUE))

which produces

[Figure: heatmap with contours using a diverging palette]
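The point of the expand_limits() call is to make the fill scale symmetric about zero, so that the neutral colour of the diverging palette lines up with an estimate of 0. A small base R sketch of computing such limits follows; the est values are made up for illustration, and the NA stands in for a grid node that has no estimate.

```r
# Hypothetical estimates; NA marks a grid node with no estimate
est <- c(-9.1, NA, 1.5, 7.8)

# Largest absolute value, ignoring missing cells
lim <- max(abs(est), na.rm = TRUE)
fill_limits <- c(-1, 1) * lim
fill_limits

# abs(max(x)) is not the same thing: it misses a dominant negative value
abs(max(est, na.rm = TRUE))
```

The resulting vector can be passed to expand_limits(fill = ...) or used as the limits argument of the fill scale.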

Finally, if {gratia} can't handle your model, I'd appreciate you filing a bug report here so that I can work on supporting as many model types as possible. But do try {mgcViz} as well for an alternative approach to visualising GAMs fitted using {mgcv}.

Following Gavin Simpson's answer and this thread ( How to add colorbar with perspective plot in R ), I think I've come up with a solution that uses plot.gam() (though I really love that {gratia} takes it into a ggplot universe and will definitely look more into that)

require(fields)
df.gam <- gam(y ~ te(x, z), data = df, method = 'REML')
sm <- as.data.frame(smooth_estimates(df.gam, dist = 0.1))
plot(df.gam, scheme = 2, hcolors = heat.colors(999, rev = TRUE), contour.col = 'black',
     rug = FALSE, main = '', cex.lab = 1.75, cex.axis = 1.75)
image.plot(legend.only = TRUE, zlim = range(sm$est), col = heat.colors(999, rev = TRUE),
           legend.shrink = 0.5, axis.args = list(at = c(-10, -5, 0, 5, 10, 15, 20)))

[Figure: GAM heatmap with colorbar]

I hope I understood correctly that gratia::smooth_estimates() actually pulls out the partial effects.
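One way to check what "partial effects" means here, without relying on {gratia}, is to look at the per-term contributions that {mgcv} itself returns. The sketch below uses data simulated with mgcv::gamSim() as a stand-in (not the data from the question; m_sim, terms and recon are names made up for illustration) and shows that the per-smooth contributions plus the intercept add up to the linear predictor.

```r
library(mgcv)

set.seed(1)
# Simulated stand-in data from mgcv's gamSim(); not the data from the question
dat <- gamSim(1, n = 200, verbose = FALSE)
m_sim <- gam(y ~ s(x0) + te(x1, x2), data = dat, method = "REML")

# One column per smooth: each term's contribution on the link scale
terms <- predict(m_sim, type = "terms")
colnames(terms)

# Partial effects plus the intercept add up to the linear predictor
lp <- predict(m_sim, type = "link")
recon <- rowSums(terms) + attr(terms, "constant")
all.equal(unname(recon), unname(as.numeric(lp)))
```

Because each smooth is centred by its identifiability constraint, the values for a single smooth can be negative as well as positive, which is worth keeping in mind when choosing colour bar limits.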

For my model with multiple terms (and multiple tensor products), this seems to work nicely by indexing the sections of the respective terms in sm. Except for one, where the colorbar and the heatmap aren't quite matching up. I can't provide the actual underlying data, but I'm adding that plot for illustration in case anyone has any idea. I'm using the same approach as outlined above. In the colorbar, dark red is at 15-20, but in the heatmap the isolines just above 0 already correspond with the dark red (while 0 is dark yellowish in the colorbar).

[Figure: heatmap and colorbar not matching]

A base plot solution would be to use fields::image.plot directly. Unfortunately, it requires data in a classic wide format, not the long format needed by ggplot.
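For reference, going from a long-format grid (one row per x, y pair) to the wide matrix that image()-style functions expect could look like the base R sketch below. The toy long data frame and its est values are assumptions made up for illustration; the rows are shuffled on purpose to show that the reshaping does not depend on row order.

```r
set.seed(42)

# Hypothetical long-format grid: one row per (x, y) combination
long <- expand.grid(x = c(1, 2, 3), y = c(10, 20))
long$est <- long$x + long$y / 10
long <- long[sample(nrow(long)), ]  # shuffle the rows

# Axis values, in increasing order as image() requires
xs <- sort(unique(long$x))
ys <- sort(unique(long$y))

# Wide format: rows index x, columns index y
z <- matrix(NA_real_, nrow = length(xs), ncol = length(ys))
z[cbind(match(long$x, xs), match(long$y, ys))] <- long$est

image(xs, ys, z)
```

Grid cells that are missing from the long data simply stay NA in the matrix and are left blank by image().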

We can facilitate plotting by grabbing the object returned by plot.gam(), and then do a little manipulation of the object to get what we need for image.plot()

Following on from @Anke's answer then, instead of plotting with plot.gam() and then using image.plot() to add the legend, we use plot.gam() to get what we need to plot, but do all the plotting in image.plot()

library('fields') # provides image.plot() and plot.surface()
plt <- plot(df.gam)
plt <- plt[[1]] # plot.gam returns a list of n elements, one per plot

# extract the `$fit` variable - this is est from smooth_estimates
fit <- plt$fit
# reshape fit (which is a 1 column matrix) to have dimension 40x40
dim(fit) <- c(40,40)
# plot with image.plot
image.plot(x = plt$x, y = plt$y, z = fit, col = heat.colors(999, rev = TRUE))
contour(x = plt$x, y = plt$y, z = fit, add = TRUE)
box()

This produces:

[Figure: image.plot() heatmap with contours and colour bar]

You could also use the fields::plot.surface() function

l <- list(x = plt$x, y = plt$y, z = fit)
plot.surface(l, type = "C", col = heat.colors(999, rev = TRUE))
box()

This produces:

[Figure: plot.surface() contour plot with colour bar]

See ?fields::plot.surface for other arguments to modify the contour plot etc.

As shown, these all have the correct range on the colour bar. It would appear that in @Anke's version the colour bar mapping is off in all of the plots, but mostly only slightly, so it wasn't as noticeable.
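The underlying idea is that the heatmap and the colour bar must be drawn from the same zlim and the same colour vector. If you want to avoid {fields} altogether, a minimal base graphics sketch of that idea is shown below. image_with_colorbar() is a hypothetical helper written for this illustration, not part of any package, and the built-in volcano matrix stands in for the fitted surface.

```r
# Hypothetical helper: heatmap plus colour bar sharing one zlim and palette
image_with_colorbar <- function(x, y, z, col = heat.colors(100, rev = TRUE)) {
  zlim <- range(z, finite = TRUE)  # one range, used for both panels
  layout(matrix(1:2, nrow = 1), widths = c(4, 1))
  on.exit(layout(1))

  # Left panel: the heatmap with contours
  image(x, y, z, zlim = zlim, col = col)
  contour(x, y, z, add = TRUE)
  box()

  # Right panel: a one-column image of the colour scale itself
  bar <- seq(zlim[1], zlim[2], length.out = length(col))
  op <- par(mar = c(5.1, 1, 4.1, 3))
  image(x = c(0, 1), y = bar, z = matrix(bar, nrow = 1), zlim = zlim,
        col = col, xaxt = "n", yaxt = "n", xlab = "", ylab = "")
  axis(4)
  box()
  par(op)

  invisible(zlim)
}

# Stand-in surface: the built-in volcano elevation matrix
zl <- image_with_colorbar(seq_len(nrow(volcano)), seq_len(ncol(volcano)), volcano)
```

With a fitted GAM you would pass the x, y and reshaped fit values taken from the object that plot.gam() returns, as in the code above.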
