
How to interpret the height of a cluster based on a correlation matrix?

I'm building a hierarchical clustering from a symmetric correlation matrix whose values range from 0 to 1.

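# Euclidean distances between the rows of the correlation matrix
docs <- dist(as.matrix(data), method = "euclidean")
hclust_dist <- as.dist(docs)
# Replace missing distances with 0 (is.na() is also TRUE for NaN)
hclust_dist[is.na(hclust_dist)] <- 0
sum(is.infinite(hclust_dist))  # this should be 0
# Ward clustering on those distances
h <- hclust(hclust_dist, method = "ward.D2")
plot(h, cex = 0.6)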

When I plot it, I get this dendrogram:

Cluster dendrogram

I want to cut the tree into groups using a correlation threshold of 0.7, meaning that units in the same group have a correlation of at least 0.7 with each other.

However, the heights in my dendrogram range from 0 to 30.

Does anyone know how to interpret this height and convert it into a correlation score from 0 to 1?

Or, do I need to use a different clustering method?
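One likely reason for the mismatch in scale, sketched here on made-up data: `dist()` applied to the correlation matrix treats each row as a point in space and measures Euclidean distances between those rows, and Ward linkage then reports heights in those units. Using `as.dist(1 - correlation)` instead takes the matrix entries themselves as dissimilarities, so heights stay between 0 and 1. The simulated variables (`a1` ... `b3`) and the object names are assumptions for illustration, not the asker's data.

```r
# Hypothetical data: 6 variables in two correlated blocks (not the asker's data).
set.seed(1)
n <- 200
base1 <- rnorm(n)
base2 <- rnorm(n)
x <- cbind(
  a1 = base1 + rnorm(n, sd = 0.3),
  a2 = base1 + rnorm(n, sd = 0.3),
  a3 = base1 + rnorm(n, sd = 0.3),
  b1 = base2 + rnorm(n, sd = 0.3),
  b2 = base2 + rnorm(n, sd = 0.3),
  b3 = base2 + rnorm(n, sd = 0.3)
)
corr <- abs(cor(x))  # values in [0, 1], like the matrix in the question

# dist() on the matrix itself: each row is treated as a point in 6-dimensional space.
d_rows <- dist(corr, method = "euclidean")
h_rows <- hclust(d_rows, method = "ward.D2")

# as.dist() on 1 - correlation: the entries themselves are the dissimilarities.
d_corr <- as.dist(1 - corr)
h_corr <- hclust(d_corr, method = "complete")

range(h_rows$height)  # Euclidean/Ward units, not tied to the correlation scale
range(h_corr$height)  # stays within [0, 1] for correlations in [0, 1]
```

With more variables the row vectors get longer, so the first range tends to grow, while the second stays bounded by 1.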

I've found a possible solution.

Instead of computing Euclidean distances on the matrix, I clustered directly on a correlation-based dissimilarity (1 - correlation), using this code:

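# Read the correlation matrix (the first column holds the row names)
data <- read.csv(file = "individuo21.csv", sep = ";", header = TRUE, row.names = 1)

# Turn correlations into dissimilarities: 1 - correlation
dissimilarity <- 1 - data
distance <- as.dist(dissimilarity)
# hclust() uses complete linkage by default
h <- hclust(distance)
plot(h, cex = 0.3)

# Heights are dissimilarities, so h = 0.70 corresponds to a correlation of 0.30
groups <- cutree(h, h = 0.70)
View(groups)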

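The resulting dendrogram has heights from 0 to 1, on the same scale as the correlation scores. Note that the height is a dissimilarity, so a height of h corresponds to a correlation of 1 - h.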
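A minimal sketch, on a made-up 5x5 correlation matrix, of how a minimum-correlation threshold maps onto a cut height when the dissimilarity is 1 - correlation. With complete linkage (the `hclust()` default), every pair inside a group cut at height h has a dissimilarity of at most h, so a minimum correlation of 0.7 corresponds to cutting at h = 1 - 0.7 = 0.3. The matrix and the names `min_corr` and `within_min` are illustrative assumptions.

```r
# Hypothetical symmetric correlation matrix with values in [0, 1].
corr <- matrix(c(
  1.00, 0.90, 0.80, 0.20, 0.10,
  0.90, 1.00, 0.75, 0.30, 0.20,
  0.80, 0.75, 1.00, 0.25, 0.15,
  0.20, 0.30, 0.25, 1.00, 0.85,
  0.10, 0.20, 0.15, 0.85, 1.00
), nrow = 5, byrow = TRUE,
dimnames = list(LETTERS[1:5], LETTERS[1:5]))

min_corr <- 0.7  # desired minimum correlation within a group

# Complete linkage on 1 - correlation; cut at the matching dissimilarity.
h <- hclust(as.dist(1 - corr), method = "complete")
groups <- cutree(h, h = 1 - min_corr)

# Smallest pairwise correlation inside each group
# (a singleton group would report 1, taken from the diagonal).
within_min <- sapply(split(names(groups), groups), function(members) {
  min(corr[members, members, drop = FALSE])
})

groups      # A, B, C fall in one group; D, E in another
within_min  # every value is at least min_corr
```

This guarantee relies on complete linkage. With single or average linkage, some pairs within a group can fall below the threshold, so the within-group check is worth keeping if the method is changed.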

Cluster obtained from the correlation matrix

