簡體   English   中英

標簽和顏色葉子樹狀圖

[英]Label and color leaf dendrogram

我正在嘗試創建一個樹形圖,我的樣本有5個組代碼(作為樣本名稱/種類/等,但重復)。

因此,我有兩個問題,幫助將是偉大的:

  • 如何在葉標簽中顯示組代碼(而不是樣本編號)?

  • 我希望為每個代碼組分配一種顏色並根據它對葉子標簽着色(可能會發生它們不在同一個分支中,我可以找到更多信息)?

是否可以使用我的腳本執行此操作(ape或ggdendro):

sample<-read.table("C:/.../DOutput.txt", header=F, sep="")
groupCodes <- sample[,1]
sample2<-sample[,2:100] 
d <- dist(sample2, method = "euclidean")  
fit <- hclust(d, method="ward")
plot(as.phylo(fit), type="fan") 
ggdendrogram(fit, theme_dendro=FALSE)  

隨機數據框替換我的read.table:

sample = data.frame(matrix(floor(abs(rnorm(20000)*100)),ncol=200))
groupCodes <- c(rep("A",25), rep("B",25), rep("C",25), rep("D",25)) # fixed error
sample2 <- data.frame(cbind(groupCodes), sample) 

以下是使用名為“ dendextend ”的新軟件包解決此問題的方法,該軟件包完全針對此類內容構建。

您可以在包的演示文稿和插圖中看到許多示例,位於以下URL的“用法”部分: https//github.com/talgalili/dendextend

以下是此問題的解決方案:(注意如何重新排序顏色以首先擬合數據,然后適應樹形圖的新順序的重要性)

####################
## Getting the data:

sample = data.frame(matrix(floor(abs(rnorm(20000)*100)),ncol=200))
groupCodes <- c(rep("Cont",25), rep("Tre1",25), rep("Tre2",25), rep("Tre3",25))
rownames(sample) <- make.unique(groupCodes)

colorCodes <- c(Cont="red", Tre1="green", Tre2="blue", Tre3="yellow")

distSamples <- dist(sample)
hc <- hclust(distSamples)
dend <- as.dendrogram(hc)

####################
## installing dendextend for the first time:

install.packages('dendextend')

####################
## Solving the question:

# loading the package
library(dendextend)
# Assigning the labels of dendrogram object with new colors:
labels_colors(dend) <- colorCodes[groupCodes][order.dendrogram(dend)]
# Plotting the new dendrogram
plot(dend)


####################
## A sub tree - so we can see better what we got:
par(cex = 1)
plot(dend[[1]], horiz = TRUE)

在此輸入圖像描述

您可以將hclust對象轉換為dendrogram並使用?dendrapply來修改每個節點的屬性(如顏色,標簽等屬性),例如:

## stupid toy example
samples <- matrix(c(1, 1, 1,
                    2, 2, 2,
                    5, 5, 5,
                    6, 6, 6), byrow=TRUE, nrow=4)

## set sample IDs to A-D
rownames(samples) <- LETTERS[1:4]

## perform clustering
distSamples <- dist(samples)
hc <- hclust(distSamples)

## function to set label color
labelCol <- function(x) {
  if (is.leaf(x)) {
    ## fetch label
    label <- attr(x, "label") 
    ## set label color to red for A and B, to blue otherwise
    attr(x, "nodePar") <- list(lab.col=ifelse(label %in% c("A", "B"), "red", "blue"))
  }
  return(x)
}

## apply labelCol on all nodes of the dendrogram
d <- dendrapply(as.dendrogram(hc), labelCol)

plot(d)

在此輸入圖像描述

編輯:為您的最小示例添加代碼:

    sample = data.frame(matrix(floor(abs(rnorm(20000)*100)),ncol=200))
groupCodes <- c(rep("A",25), rep("B",25), rep("C",25), rep("D",25))

## make unique rownames (equal rownames are not allowed)
rownames(sample) <- make.unique(groupCodes)

colorCodes <- c(A="red", B="green", C="blue", D="yellow")


## perform clustering
distSamples <- dist(sample)
hc <- hclust(distSamples)

## function to set label color
labelCol <- function(x) {
  if (is.leaf(x)) {
    ## fetch label
    label <- attr(x, "label")
    code <- substr(label, 1, 1)
    ## use the following line to reset the label to one letter code
    # attr(x, "label") <- code
    attr(x, "nodePar") <- list(lab.col=colorCodes[code])
  }
  return(x)
}

## apply labelCol on all nodes of the dendrogram
d <- dendrapply(as.dendrogram(hc), labelCol)

plot(d)

在此輸入圖像描述

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM