Generating a high-resolution dendrogram plot in R

I am trying to generate a high-resolution dendrogram in R.

The difficulty is that there are more than 200 leaf nodes, and each node is identified by a string. I would like to ensure that each of these string labels is readable in the generated (printed) plot.

I would also like to swap the axes, so that the leaf nodes (originally on the x-axis) run along the y-axis and the distances (originally on the y-axis) run along the x-axis. To make the plot easier to read, I would like to add a second x-axis, showing the distance information of the swapped plot, at the top of the plot. How can one do this in R?

You can achieve this with standard R functions.

Plot a dendrogram

To plot a dendrogram from a distance matrix you can use the hclust function. See its man page for further details on the algorithms available.

# To produce a dummy distance matrix
distMatrix <- dist(matrix(1:9, ncol=3))

# To convert it into a tree
tree <- hclust(distMatrix)

For the plot, the dendrogram class provides a useful plot method. Convert the hclust output to a dendrogram first:

dendro <- as.dendrogram(tree)

The plot method provides a horiz argument that swaps the X and Y axes. Compare the following:

plot(dendro, horiz=TRUE)
plot(dendro, horiz=FALSE)

Manage its size

Readability depends on the device you use to export the image. R can produce huge images; it is up to the user to set the size and resolution. See the man page for png or pdf for further details (width, height and res are the interesting arguments).
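As a minimal sketch of the sizing step, the example below writes a tall PDF for 200 labelled leaves. The data, the label names (sample_001 and so on), the file name and the 40 inch height are all made up for illustration. png takes width and height in the same way (with units = "in"), plus res for the resolution in pixels per inch.

```r
# Hypothetical data: 200 observations with string labels (not from the question)
set.seed(1)
m <- matrix(rnorm(200 * 5), nrow = 200)
rownames(m) <- sprintf("sample_%03d", 1:200)
dendro <- as.dendrogram(hclust(dist(m)))

# About 0.2 inch of height per leaf keeps 12 pt labels from overlapping
out <- file.path(tempdir(), "dendrogram.pdf")
pdf(out, width = 8, height = 40)
# Widen the right margin (in text lines) so the leaf labels are not clipped
op <- par(mar = c(4, 1, 1, 8))
plot(dendro, horiz = TRUE)
par(op)
dev.off()
```

A PDF is a vector format, so the labels stay sharp at any zoom level; for a bitmap, raise res instead.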

Another track to follow is the graphical parameters: by playing with the various cex values, you will be able to resize the labels. See the man page of par for further details.
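A small sketch of the cex approach, assuming made-up data with 50 labelled rows. The value 0.6 is only a starting point to tune by eye; par(cex = ...) scales all text on the device, including the axis annotation.

```r
# Hypothetical data: 50 observations with string labels
set.seed(1)
m <- matrix(rnorm(50 * 4), nrow = 50)
rownames(m) <- sprintf("item_%02d", 1:50)
dendro <- as.dendrogram(hclust(dist(m)))

pdf(NULL)                                   # device that writes no file, for experimenting
op <- par(cex = 0.6, mar = c(4, 1, 1, 6))   # shrink text, widen the right margin for labels
plot(dendro, horiz = TRUE)
par(op)                                     # restore the previous settings
dev.off()
```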

Readability is quite human oriented, so I don't think you will find an automated way to obtain a readable plot, but with a little manual tuning you can achieve it with the tools I mentioned. If automation is mandatory, it can be obtained using some par elements generated by R, like cin, to predict the needed device size, but it is much simpler to tune it manually.
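If you do want to try the automated route, here is a rough sketch of one way to use cin. It is an approximation, not a guaranteed fit: it assumes one line of text per leaf, estimates label width as number of characters times the character width, and reserves a fixed 1 inch for the top and bottom margins. The data, names and constants are hypothetical.

```r
# Hypothetical data: 200 observations with string labels
set.seed(1)
m <- matrix(rnorm(200 * 5), nrow = 200)
rownames(m) <- sprintf("sample_%03d", 1:200)
dendro <- as.dendrogram(hclust(dist(m)))

cex <- 0.8
# Query the default character size in inches (width, height) without writing a file
pdf(NULL); cin <- par("cin"); dev.off()

leaf_in   <- cin[2] * cex                            # vertical room per leaf
label_in  <- max(nchar(rownames(m))) * cin[1] * cex  # rough width of the longest label
height_in <- nrow(m) * leaf_in + 2                   # plus 1 inch margin top and bottom

out <- file.path(tempdir(), "dendrogram_auto.pdf")
pdf(out, width = 7 + label_in, height = height_in)
# mai sets the margins in inches: bottom, left, top, right
op <- par(cex = cex, mai = c(1, 0.2, 1, label_in + 0.2))
plot(dendro, horiz = TRUE)
par(op)
dev.off()
```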

New axis

The axis function can help you: called after the plot, axis(side = 3) draws an additional axis at the top.
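For instance, reusing the dummy distance matrix from above, a horizontal dendrogram with a second distance axis on top could look like this. The file name and the "Distance" label are arbitrary choices for the sketch.

```r
dendro <- as.dendrogram(hclust(dist(matrix(1:9, ncol = 3))))

out <- file.path(tempdir(), "dendrogram_top_axis.pdf")
pdf(out)
plot(dendro, horiz = TRUE)               # the distance axis is drawn at the bottom
axis(side = 3)                           # add the same scale at the top
mtext("Distance", side = 3, line = 2.5)  # optional label above the top axis
dev.off()
```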

Took me a while to get this:

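# Assumed inputs, not defined in the original snippet:
#   x        - data matrix whose columns are the items being clustered
#   distance - a dist object for those columns
#              (judging by the plot title, presumably as.dist(1 - cor(x)))
#   fout     - path of the PDF file to write
# get font factor: number of text heights (at the device point size) per inch
# pdf(NULL) opens a device without writing a file
pdf(NULL); ff <- 72 / par("ps"); dev.off()
# if there are more than 20 entries
if (ncol(x) > 20) {
    # scale output height by font size
    pdf(fout, height = ncol(x) / ff)
} else {
    pdf(fout)
}
# increase right margin width to fit the longest label
op <- par(mar = par("mar") + c(0, 0, 0, 2 * max(nchar(colnames(x))) / ff))
# plot horizontally
plot(as.dendrogram(hclust(distance), hang = -1), main = "Dissimilarity = 1 - Correlation", xlab = "", horiz = TRUE)
# restore margin
par(op)
dev.off()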
