R rect.hclust: rectangles too high in dendrogram

I asked a number of different experts to sort 92 objects based on their similarity. Based on their answers, I constructed a 92 x 92 dissimilarity matrix. In R, I examined this matrix using the following commands:

cluster1 <- hclust(as.dist(DISS_MATRIX), method = "average") 
plot(cluster1, cex=.55)
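The question does not show how DISS_MATRIX was built. As a rough sketch only, one common way to turn sorting data into dissimilarities is to take, for each pair of objects, the share of experts who placed them in different piles. Everything below (the number of experts, the number of piles, the random `sorts` matrix) is made up for illustration and is not the original data:

```r
# Hypothetical stand-in for the expert sortings; all values are simulated.
set.seed(42)
n_objects <- 92
n_experts <- 10   # assumed, not stated in the question
n_piles   <- 5    # assumed, not stated in the question

# sorts[i, e] = pile into which expert e placed object i
sorts <- matrix(sample(n_piles, n_objects * n_experts, replace = TRUE),
                nrow = n_objects, ncol = n_experts)

# co_occurrence[i, j] = number of experts who put objects i and j in the same pile
co_occurrence <- matrix(0, n_objects, n_objects)
for (e in seq_len(n_experts)) {
  co_occurrence <- co_occurrence + outer(sorts[, e], sorts[, e], "==")
}

# Dissimilarity = share of experts who separated the two objects
DISS_MATRIX <- 1 - co_occurrence / n_experts
object_names <- paste0("A", seq_len(n_objects))
dimnames(DISS_MATRIX) <- list(object_names, object_names)
```

A matrix built this way is symmetric with a zero diagonal, so it can be passed to as.dist() and hclust() exactly as in the commands above.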

To highlight the clusters, I wanted to draw rectangles around them:

rect.hclust(cluster1, k = 3, border = "red")

The result is as follows:

[Image: the resulting dendrogram]
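For readers who want to check which objects ended up inside each rectangle: rect.hclust invisibly returns a list with one element per rectangle, and the grouping should agree with cutree for the same k. A minimal sketch, using random points as a stand-in for the real dissimilarity matrix (the data here are an assumption, not the original):

```r
# Stand-in data: 92 random 2-D points instead of the real expert sortings.
set.seed(1)
n <- 92
DISS_MATRIX <- as.matrix(dist(matrix(rnorm(n * 2), ncol = 2)))
rownames(DISS_MATRIX) <- colnames(DISS_MATRIX) <- paste0("A", seq_len(n))

cluster1 <- hclust(as.dist(DISS_MATRIX), method = "average")
plot(cluster1, cex = 0.55)

# rect.hclust draws the rectangles and invisibly returns the members
# of each highlighted cluster.
members <- rect.hclust(cluster1, k = 3, border = "red")

# cutree gives the same partition as a vector of cluster ids.
groups <- cutree(cluster1, k = 3)

lengths(members)   # size of each rectangle's cluster
table(groups)      # same sizes, possibly in a different order
```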

However, when the objects have longer names ("AAAAAAAAAAAAAAAA43" instead of "A43"), the formatting is off:

rownames(DISS_MATRIX) <- paste0(rep("AAAAAAAAAAAAAAAAAAAAAAAAAAAA",92),1:92)
colnames(DISS_MATRIX) <- paste0(rep("AAAAAAAAAAAAAAAAAAAAAAAAAAAA",92),1:92)
cluster1 <- hclust(as.dist(DISS_MATRIX), method = "average") 
plot(cluster1, cex=.55)
rect.hclust(cluster1, k = 3, border = "red")

This can be seen in the resulting dendrogram.

[Image: the dendrogram with long labels and misplaced rectangles]

The rectangles seem to have moved up to the end of the dendrogram. Not nice. I assume this glitch is caused by the long names of the 92 objects in the dissimilarity matrix. It may not seem very relevant: one could simply make sure the object names are short enough.

However, for various reasons I want my objects to keep their original (i.e. admittedly long) names. This graph is for a presentation, so I do not want to work with codes. I also do not want to use any other package since I generally find hclust quite easy to use. However, I cannot find any way to position the rectangles within the rect.hclust command. So what can I do to position the rectangles correctly in the dendrogram even if the object names are long? Thanks.

You wrote that "I also do not want to use any other package since I generally find hclust quite easy to use."

While hclust is great for creating the hierarchical clustering object, it does not offer much in terms of plotting. Once you have the hclust output, it is better to convert it to a dendrogram (using as.dendrogram) for visualization, since that class is better suited for it. There is no way to do what you want without fairly sophisticated code, and since that code is already available in a package, using it is (IMHO) the best route forward. (I know because I wrote rect.dendrogram, and it took a lot of work to get it to behave the way you want.)

The dendextend R package provides many functions for manipulating and visualizing dendrograms (see the package vignette). Specifically, the rect.dendrogram function can handle cases like the one you asked about (long labels). For example (I've added color_branches and color_labels for the fun of it):

library(dendextend)
# cluster the cars on two variables, then convert to a dendrogram
hc <- mtcars[, c("mpg", "disp")] %>% dist %>% hclust(method = "average")
dend <- hc %>% as.dendrogram %>% hang.dendrogram
# let's make the text longer
labels(dend)[1] <- "AAAAAAAAAAAAAAAAAAAAA"

# enlarge the bottom margin so the long label fits
par(mar = c(15,2,1,1))
dend %>% color_branches(k=3) %>% color_labels(k=3) %>% plot
dend %>% rect.dendrogram(k=3)

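[Image: the dendrogram with colored branches, colored labels, and rectangles that extend around the long label]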
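The same approach can be applied to a case closer to the one in the question: 92 objects, all with long names. The sketch below is only illustrative. It uses random points as a stand-in for the real dissimilarity matrix, and the margin size is a guess that may need adjusting for other label lengths or devices:

```r
library(dendextend)

# Stand-in data: 92 random 2-D points instead of the real expert sortings.
set.seed(1)
n <- 92
DISS_MATRIX <- as.matrix(dist(matrix(rnorm(n * 2), ncol = 2)))
long_names <- paste0("AAAAAAAAAAAAAAAAAAAAAAAAAAAA", seq_len(n))
rownames(DISS_MATRIX) <- colnames(DISS_MATRIX) <- long_names

# Cluster with hclust as before, then switch to a dendrogram for plotting.
cluster1 <- hclust(as.dist(DISS_MATRIX), method = "average")
dend <- as.dendrogram(cluster1)
dend <- set(dend, "labels_cex", 0.55)   # same label size as cex = .55 above

par(mar = c(15, 2, 1, 1))               # large bottom margin for the long names
plot(dend)
rect.dendrogram(dend, k = 3, border = "red")
```

The clustering itself still comes from hclust; only the plotting step changes.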
