
Plotting a clustered heatmap with dendrograms using R's plotly

I'm following this example on how to create a clustered heatmap with dendrograms using R's plotly. Here's the example:

library(ggplot2)
library(ggdendro)
library(plotly)

# dendrogram data
x <- as.matrix(scale(mtcars))
dd.col <- as.dendrogram(hclust(dist(x)))
dd.row <- as.dendrogram(hclust(dist(t(x))))
dx <- dendro_data(dd.row)
dy <- dendro_data(dd.col)

# helper function for creating dendrograms
ggdend <- function(df) {
  ggplot() +
    geom_segment(data = df, aes(x=x, y=y, xend=xend, yend=yend)) +
    labs(x = "", y = "") + theme_minimal() +
    theme(axis.text = element_blank(), axis.ticks = element_blank(),
          panel.grid = element_blank())
}

# x/y dendrograms
px <- ggdend(dx$segments)
py <- ggdend(dy$segments) + coord_flip()

# heatmap
col.ord <- order.dendrogram(dd.col)
row.ord <- order.dendrogram(dd.row)
xx <- scale(mtcars)[col.ord, row.ord]
xx_names <- attr(xx, "dimnames")
df <- as.data.frame(xx)
colnames(df) <- xx_names[[2]]
df$car <- xx_names[[1]]
df$car <- with(df, factor(car, levels=car, ordered=TRUE))
mdf <- reshape2::melt(df, id.vars="car")
p <- ggplot(mdf, aes(x = variable, y = car)) + geom_tile(aes(fill = value))

mat <- matrix(unlist(dplyr::select(df,-car)),nrow=nrow(df))
# all columns except the trailing `car` column
colnames(mat) <- colnames(df)[seq_len(ncol(df) - 1)]
rownames(mat) <- rownames(df)

# hide axis ticks and grid lines
eaxis <- list(
  showticklabels = FALSE,
  showgrid = FALSE,
  zeroline = FALSE
)

p_empty <- plot_ly(filename="r-docs/dendrogram") %>%
  # note that margin applies to entire plot, so we can
  # add it here to make tick labels more readable
  layout(margin = list(l = 200),
         xaxis = eaxis,
         yaxis = eaxis)

subplot(px, p_empty, p, py, nrows = 2, margin = 0.01)

which gives:

[image]

I changed the code a bit so that the heatmap is generated with plotly rather than ggplot, since that runs faster on my real, much larger data. Hence I do:

heatmap.plotly <- plot_ly() %>%
  add_heatmap(z = ~mat,
              x = factor(colnames(mat), levels = colnames(mat)),
              y = factor(rownames(mat), levels = rownames(mat)))

And then:

subplot(px, p_empty, heatmap.plotly, py, nrows = 2, margin = 0.01)

which gives:

[image]

My questions are:

  1. How do I keep the row and column labels of the heatmap from getting cut off, as they do in both plots?

  2. The label of the colorbar is changed to "mat" in the second figure. Any idea how to prevent that?

  3. How do I change the margins between the heatmap and the dendrograms?

Making a fully working cluster heatmap with plotly is not as simple as it may seem at first. Luckily, there is an R package called heatmaply which does just that. You can see many examples of its features in the online vignette.

For example:

install.packages("ggplot2")
install.packages("plotly")
install.packages("heatmaply")

library(heatmaply)
heatmaply(scale(mtcars), k_row = 3, k_col = 2)

[image]

This figure is fully interactive (both the heatmap and the dendrograms). Notice that it uses dendextend (a more developed version of ggdendro, which can also, for example, account for branch colors, line types, and line widths).

Specifically setting the margins of the dendrograms is an open issue (from just today), but this will probably get resolved soon.
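For the cut-off labels and the colorbar title, a minimal sketch along the following lines may help. It assumes a heatmaply version that provides the `margins` and `key.title` arguments (check `?heatmaply` for your installed version); the margin values and the title "z-score" are arbitrary illustrations, not recommended settings.

```r
library(heatmaply)

# Assumption: `margins` is given in the order bottom, left, top, right,
# and `key.title` sets the colorbar title.
p <- heatmaply(
  scale(mtcars),
  k_row = 3, k_col = 2,
  margins = c(60, 150, 20, 20),  # extra room for column and row labels
  key.title = "z-score"          # illustrative colorbar title
)
p
```

If the labels are still clipped, increasing the bottom and left values is the first thing to try.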

How do I keep the row and column labels of the heatmap from getting cut off, as they do in both plots?

Try setting the margins after the plot has been generated:

sply <- subplot(px, p_empty, heatmap.plotly, py, nrows = 2)
sply <- layout(sply,
               margin = list(l = 150,
                             r = 0,
                             b = 50,
                             t = 0
                            )
               )

The label of the colorbar is changed to "mat" in the second figure. Any idea how to prevent that?

No idea how to prevent it, but you can overwrite the label:

sply$x$data[[3]]$colorbar$title <- 'mat'
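A plausible explanation, stated here as an assumption rather than documented behavior, is that plotly derives the default colorbar title from the expression passed as `z = ~mat`. Instead of patching the built object afterwards, the title can also be set when the trace is created, through the heatmap trace's `colorbar` attribute. The sketch below uses a small stand-in matrix and an illustrative title "value":

```r
library(plotly)

# Small stand-in for the scaled matrix `mat` from the question.
mat <- scale(as.matrix(mtcars[1:5, 1:4]))

heatmap.plotly <- plot_ly() %>%
  add_heatmap(
    z = ~mat,
    x = factor(colnames(mat), levels = colnames(mat)),
    y = factor(rownames(mat), levels = rownames(mat)),
    colorbar = list(title = "value")  # explicit title instead of the default
  )
heatmap.plotly
```

This avoids relying on the trace's position (`data[[3]]`) inside the combined subplot.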

How do I change the margins between the heatmap and the dendrograms?

You can specify the domain for each axis of each subplot. yaxis corresponds to the plot in the upper left corner, yaxis2 to the plot right next to it, etc.

Increasing the distance works better than decreasing it.

sply <- layout(sply,
               yaxis = list(domain=c(0.47, 1)),
               xaxis = list(domain=c(0, 0.5)),
               xaxis3 = list(domain=c(0, 0.5)),
               xaxis4 = list(domain=c(0.5, 1))
               )

[image]
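As an alternative to editing axis domains by hand, `subplot()` itself takes `widths`, `heights`, and `margin` arguments that control the share of each column and row and the gap between panels. The sketch below uses four small placeholder plots (`top`, `empty`, `heat`, and `right` are hypothetical stand-ins for `px`, `p_empty`, `heatmap.plotly`, and `py`), and the proportions are arbitrary starting values:

```r
library(plotly)

# Placeholder panels standing in for the dendrograms, the empty corner,
# and the heatmap from the question.
top   <- plot_ly(x = 1:3, y = c(1, 3, 2), type = "scatter", mode = "lines")
empty <- plotly_empty(type = "scatter", mode = "markers")
heat  <- plot_ly(z = matrix(1:9, nrow = 3), type = "heatmap")
right <- plot_ly(x = c(1, 3, 2), y = 1:3, type = "scatter", mode = "lines")

s <- subplot(
  top, empty, heat, right,
  nrows   = 2,
  heights = c(0.2, 0.8),  # top row (dendrogram) vs. bottom row (heatmap)
  widths  = c(0.8, 0.2),  # left column (heatmap) vs. right column (dendrogram)
  margin  = 0.005         # gap between neighboring panels
)
s
```

A smaller `margin` pulls the dendrograms closer to the heatmap, while `heights` and `widths` decide how much of the figure each panel occupies.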

pl <- subplot(px, p_empty, p, py, nrows = 2)
heatmap.plotly <- plot_ly() %>%
  add_heatmap(z = ~mat,
              x = factor(colnames(mat), levels = colnames(mat)),
              y = factor(rownames(mat), levels = rownames(mat)))
sply <- subplot(px, p_empty, heatmap.plotly, py, nrows = 2)
sply$x$data[[3]]$colorbar$title <- 'mat'
sply <- layout(sply,
               yaxis = list(domain=c(0.47, 1)),
               xaxis = list(domain=c(0, 0.5)),
               xaxis3 = list(domain=c(0, 0.5)),
               xaxis4 = list(domain=c(0.5, 1)),
               margin = list(l = 150,
                             r = 0,
                             b = 50,
                             t = 0
                             )
               )

sply


 