Change leaf color in plot.dendrogram like with plot.phylo of package ape

I am trying to plot the result of agglomerative clustering (UPGMA with agnes) in the same 'style' as a tree plotted with the package 'ape'. A simple example is shown in the figure below.

[Figure 1: simple example of the desired final output]

The key issue is that I want to color the leaves of the dendrogram based on a pattern in the leaf labels. I tried two approaches: hc2Newick, and the code by Joris Meys proposed in an answer to "Change Dendrogram leaves". Neither gave a satisfactory output. It may also be that I do not fully understand how dendrograms are constructed. An ASCII save of the abundance.agnes.ave object (stored from running agnes) can be found at https://www.dropbox.com/s/gke9qnvwptltkky/abundance.agnes.ave .

With the first option (hc2Newick from Bioconductor's ctc package), the following code produces the figure below:

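library(cluster)  # as.hclust method for agnes objects
library(ctc)      # hc2Newick (Bioconductor)
library(ape)      # read.tree, ladderize, plot.phylo, add.scale.bar

write(hc2Newick(as.hclust(abundance.agnes.ave)), file = "all_samples_euclidean.tre")
eucltree <- read.tree(file = "all_samples_euclidean.tre")
eucltree.laz <- ladderize(eucltree, FALSE)
# Colour tips green by default, red where the label contains "II"
tiplabs <- eucltree.laz$tip.label
colourtips <- rep("green", length(tiplabs))
colourtips[grep("II", tiplabs)] <- "red"
plot(eucltree.laz, tip.color = colourtips, adj = 1, cex = 0.6, use.edge.length = FALSE)
add.scale.bar()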

[Figure: output using plot.phylo]

This is obviously not ideal: the 'alignment' of the plot is not what I wanted. I suppose this has to do with the branch length calculation, but I have no idea how to solve it, especially when compared with the results of the colLab function, which look more like the dendrogram style I would like to report. Also, using use.edge.length=T in the code above gives a clustering that is still not 'aligned' properly:

[Figure: plot.phylo with branch lengths]

The second approach, using Joris Meys' colLab function with the following code, gives the next figure:

clusDendro<-as.dendrogram(as.hclust(abundance.agnes.ave))
labelColors<-c("red","green")
clusMember<-rep(1,length(rownames(abundance.x)))
clusMember[grep("II",rownames(abundance.x))]<-2
names(clusMember)<-rownames(abundance.x)

colLab <- function(n)
{
  if(is.leaf(n)) {
    a <- attributes(n)
    # clusMember - a vector designating leaf grouping
    # labelColors - a vector of colors for the above grouping
    labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
    attr(n, "nodePar") <- c(a$nodePar, lab.col = labCol)
  }
  n
}

clusDendro <- dendrapply(clusDendro, colLab)
plot(clusDendro, horiz = TRUE, axes = FALSE)

[Figure: output using colLab]

This plot is getting closer to what I want; however, I do not know why open circles appear at the leaves or how to remove them.

Any help is much appreciated.

Kind regards,

FM

This functionality is now available in a package called "dendextend", built exactly for this sort of thing.

You can see many examples in the presentations and vignettes of the package, in the "usage" section at the following URL: https://github.com/talgalili/dendextend

An almost identical question was just answered in the following SO answer:

https://stackoverflow.com/a/18832457/256662
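
As a rough illustration of the dendextend route, here is a minimal sketch. It assumes the dendextend package is installed, and it uses the built-in USArrests data with an artificial "_II" suffix on a few labels as a stand-in for the asker's abundance data, which is not reproduced here. The assignment function labels_colors<- sets the label colors in the order returned by labels(dend).

```r
# Assumes dendextend is installed: install.packages("dendextend")
library(dendextend)

# Stand-in data (an assumption, not the asker's data): 10 rows of USArrests,
# with "_II" appended to three labels so there is a pattern to match
x <- USArrests[1:10, ]
rownames(x)[c(2, 5, 7)] <- paste0(rownames(x)[c(2, 5, 7)], "_II")

# Average linkage (UPGMA), converted to a dendrogram
dend <- as.dendrogram(hclust(dist(x), method = "average"))

# labels(dend) gives the leaf labels in plotting order
leafCols <- ifelse(grepl("II", labels(dend)), "red", "green")
labels_colors(dend) <- leafCols

plot(dend, horiz = TRUE, axes = FALSE)
```

With a real agnes result, as.dendrogram(as.hclust(abundance.agnes.ave)) would presumably take the place of the hclust() call.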

I wrote that code quite a while ago, and it appears that something has changed a little in the mechanism.

The plot.dendrogram function I used has an argument nodePar. The behaviour has changed since the last time I used that function, and although that argument is normally used for the inner nodes, it apparently has an effect on the outer nodes as well. The default value for pch is now 1:2, according to the help files.

Hence, you need to explicitly specify pch=NA in the attributes you add to the outer nodes in the colLab function. Try adapting it like this:

colLab <- function(n)
{
  if(is.leaf(n)) {
    a <- attributes(n)
    # clusMember - a vector designating leaf grouping
    # labelColors - a vector of colors for the above grouping
    labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]

    attr(n, "nodePar") <- 
        if(is.list(a$nodePar)) c(a$nodePar, lab.col = labCol,pch=NA) else
                               list(lab.col = labCol,pch=NA)
  }
  n
}

On my machine, that solves the problem.

Alternatively, you could take a look at the argument use.edge.length of the function plot.phylo in the ape package. You set it to FALSE, but from your explanation I believe you want it set to the default, TRUE.

EDIT: To make the function more generic, it might be a good idea to add labelColors and clusMember as arguments to the function. My quick-n-dirty solution isn't the best example of clean code...
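
One way the more generic version could look is sketched below. The helper name makeColLab is hypothetical, and USArrests with an artificial "_II" suffix stands in for the asker's data. The helper takes clusMember and labelColors as arguments and returns the function that dendrapply() expects, so nothing is read from the global environment.

```r
# Hypothetical helper: builds a dendrapply() callback that colours leaf labels.
# clusMember  - named vector mapping each leaf label to a group index
# labelColors - vector of colours, indexed by group
makeColLab <- function(clusMember, labelColors) {
  function(n) {
    if (is.leaf(n)) {
      a <- attributes(n)
      labCol <- labelColors[clusMember[[a$label]]]
      # Keep nodePar a list so numeric and character settings keep their types;
      # pch = NA suppresses the open circles at the leaves
      attr(n, "nodePar") <- c(as.list(a$nodePar), list(lab.col = labCol, pch = NA))
    }
    n
  }
}

# Stand-in data (an assumption): three labels carry the "II" pattern
x <- USArrests[1:10, ]
rownames(x)[c(2, 5, 7)] <- paste0(rownames(x)[c(2, 5, 7)], "_II")
clusMember <- ifelse(grepl("II", rownames(x)), 2, 1)
names(clusMember) <- rownames(x)

dend <- as.dendrogram(hclust(dist(x), method = "average"))
dend <- dendrapply(dend, makeColLab(clusMember, c("green", "red")))
plot(dend, horiz = TRUE, axes = FALSE)
```

Returning a closure keeps the one-argument signature that dendrapply() calls, while the grouping and colours are passed in explicitly.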

Also, forget what I said about using the edge length. The ape package interprets it as a real dendrogram, and setting use.edge.length to TRUE will convert the edge lengths to evolution time. Hence the 'weird' outline of the dendrogram.

Also note that if the tree leaves don't have a nodePar attribute, adding extra parameters using the c() function will lead to undesired effects: if you add e.g. lab.cex=0.6, the c() function will create a vector instead of a list and convert the value for lab.cex to character whenever there's a character value in the parameter list. In this case that is the name of the color, which explains the error you mention in the comment.
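
The coercion described above can be reproduced in isolation with base R. This is only a small demonstration; the variable names bad, good, existing and merged are illustrative and not part of the original code.

```r
# c() on atomic values builds an atomic vector, so the number is coerced to character
bad <- c(lab.col = "red", lab.cex = 0.6)
class(bad)            # "character"
bad[["lab.cex"]]      # "0.6"

# list() keeps each element's own type
good <- list(lab.col = "red", lab.cex = 0.6, pch = NA)
class(good$lab.cex)   # "numeric"

# Safe append pattern: turn the existing (possibly NULL) nodePar into a list first
existing <- NULL
merged <- c(as.list(existing), list(lab.col = "red", lab.cex = 0.6, pch = NA))
is.list(merged)       # TRUE
```

Wrapping the existing value in as.list() means the same line works whether or not the leaf already carries a nodePar attribute.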


 