How to perform hierarchical clustering with predefined clusters/classes?

I ran hierarchical clustering on a database with agnes() and it worked well (I followed the approach described here: https://uc-r.github.io/hc_clustering). Now I want to compare man-made clusters, or classes, in the database with the ones that the hierarchical clustering found. I think I can do this with tanglegram(). However, I do not know how to generate a dendrogram, or perform hierarchical clustering, when I already have groups. How can I tell R about the groups? It would be great if you could answer this question methodically. `

set.seed(73)
# columns must be numeric (not strings) for dist() and agnes(); r1..r6 become row names
great <- data.frame(c1=c(0.89,46,0,0.56,12,0),c2=c(0,0.45,45,79,0.45,4.4),row.names=c("r1","r2","r3","r4","r5","r6"))

#euclidean distance

great_dist <- dist(great)

#agglomerative clustering with agnes()
#Ward's method minimizes the total within-cluster variance
#at each step the pair of clusters with the minimum between-cluster distance is merged

hc1_wards <- agnes(great,method ="ward")
 #agglomerative coefficient
hc1_wards$ac

#pltree() only draws the plot, so its result is not stored
pltree(hc1_wards, cex = 0.6, hang = -1, main = "Dendrogram\nagglomerative clustering", labels = FALSE)

#cutting into a specific amount of clusters

#average silhouette method

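#k.max must be smaller than the number of observations (6 here); the default of 10 is too large
fviz_nbclust(great, FUN = hcut, method = "silhouette", k.max = 5)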

# Cut tree into 2 groups

great_grp <- agnes(great, method = "ward")
#cutree() needs the clustering result, not the raw data
great_grp_cut <- cutree(as.hclust(great_grp), k = 2)

#using the cutree output to add the cluster each observation belongs to as a new column

great_cluster <- mutate(great,cluster = great_grp_cut)  


#evaluating goodness of cluster with dunn()
#with count() how many obs. in one cluster

count(great_cluster,cluster)

dunn <- clValid::dunn(distance = great_dist,clusters = great_grp_cut)

`

Rows 1, 2, 4 and rows 3, 5, 6 are the man-made clusters of great.

cl1 <- great[c(1,2,4), ]
cl2 <- great[c(3,5,6), ]

I want to compare the hierarchical clustering with the man-made clustering. How can I build a dendrogram from the man-made clustering in order to compare the two with tanglegram()? Is there another way to compare them?

To compare the clusters visually, you can use the plotDendroAndColors() function from the WGCNA package. The function simply displays custom color information for each object under the dendrogram.

I cannot reproduce your example (the packages you used in your code are not specified), so I am demonstrating this using a simple clustering of the iris dataset:

library(WGCNA)

# "ward" was renamed to "ward.D" in R 3.1.0; the old name still works but prints a message
fit     <- hclust(dist(iris[,-5]), method="ward.D")
groups  <- cutree(fit, 3)
manmade <- as.numeric(iris$Species)

plotDendroAndColors(fit, cbind(clusters=labels2colors(groups), manmade=labels2colors(manmade)))
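As a numeric complement to the plot, the two labelings can also be cross-tabulated. The following is a minimal sketch in base R on the same iris example; the name `agreement` is only illustrative and is not part of the code above.

```r
# Base R only; no extra packages needed.
fit     <- hclust(dist(iris[, -5]), method = "ward.D")
groups  <- cutree(fit, 3)
manmade <- as.numeric(iris$Species)

# Rows: clusters found by hclust; columns: man-made classes.
# A good match puts most of each row's count into a single column.
agreement <- table(cluster = groups, manmade = manmade)
print(agreement)
```

Cluster numbers returned by cutree() are arbitrary, so cluster 1 does not have to line up with class 1; what matters is whether each row concentrates in one column.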

Since you are using a third-party package for clustering, you might first have to convert its result to an hclust object for this plotting function to work, because plotDendroAndColors() expects a tree such as the one returned by hclust(). Maybe via:

fit <- as.hclust(hc1_wards)
# groups and manmade must hold one label per row of your own data here
plotDendroAndColors(fit, cbind(clusters=labels2colors(groups), manmade=labels2colors(manmade)))
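If a single agreement score is preferred over a plot, one option is the adjusted Rand index. The sketch below applies it to the six rows from the question, assuming the values are stored as numbers and that the man-made classes are rows 1, 2, 4 versus rows 3, 5, 6 as stated there. The helper `adjusted_rand()` and the vector `manmade` are hypothetical names written for this illustration, not functions from any package.

```r
library(cluster)  # agnes(); a recommended package shipped with R

# The question's data, stored as numeric columns (an assumption of this sketch).
great <- data.frame(
  c1 = c(0.89, 46, 0, 0.56, 12, 0),
  c2 = c(0, 0.45, 45, 79, 0.45, 4.4),
  row.names = c("r1", "r2", "r3", "r4", "r5", "r6")
)

hc      <- agnes(great, method = "ward")
found   <- cutree(as.hclust(hc), k = 2)
manmade <- c(1, 1, 2, 1, 2, 2)  # rows 1, 2, 4 vs. rows 3, 5, 6

# Adjusted Rand index computed from the contingency table of two labelings.
adjusted_rand <- function(a, b) {
  tab       <- table(a, b)
  n_pairs   <- choose(length(a), 2)
  sum_cells <- sum(choose(tab, 2))
  sum_rows  <- sum(choose(rowSums(tab), 2))
  sum_cols  <- sum(choose(colSums(tab), 2))
  expected  <- sum_rows * sum_cols / n_pairs
  maximum   <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (maximum - expected)
}

ari <- adjusted_rand(found, manmade)
print(ari)
```

A value of 1 means the two groupings are identical up to relabeling, and values near 0 mean the agreement is about what random labeling would give. With only six observations the score is very coarse, so it is best read alongside the dendrogram.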
