
Clustering using Gower distance in R

I have a data frame that contains both categorical and numeric variables. I want to cluster this data using Gower distance and get the cluster assignments back as a vector, as the kmeans function does. How can I achieve that?

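You can use the vegan package to generate your Gower distance matrix, and then create your clusters using the cluster package. Note that vegdist() expects numeric input, so if your data frame contains factor columns, daisy(dataframe, metric = "gower") from the cluster package is the usual alternative.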

library(vegan)
gow.mat <- vegdist(dataframe, method = "gower")

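Then you can feed that matrix into the pam() function (partitioning around medoids). The example below uses the Gower distances to generate 5 clusters; diss = TRUE tells pam() that the input is a dissimilarity matrix rather than raw data.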

library(cluster)
clusters <- pam(x = gow.mat, k = 5, diss = TRUE)

You can then get the cluster assignments, one integer label per row, from:

clusters$clustering
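
For a data frame that really does mix factors and numerics, the same workflow can be sketched with daisy() from the cluster package in place of vegdist(). This is a minimal sketch, not the answer's exact code: the data frame df and its columns are made up for illustration, and k = 2 is an arbitrary choice.

```r
library(cluster)  # provides daisy() and pam()

# Hypothetical mixed-type data: two numeric columns and one factor
set.seed(42)
df <- data.frame(
  height = c(rnorm(10, 160, 5), rnorm(10, 185, 5)),
  weight = c(rnorm(10, 55, 4), rnorm(10, 90, 4)),
  group  = factor(rep(c("a", "b"), each = 10))
)

# Gower dissimilarities; daisy() accepts factors and numerics together
gow.mat <- daisy(df, metric = "gower")

# Partition around medoids on the precomputed dissimilarities
fit <- pam(gow.mat, k = 2, diss = TRUE)

# Integer vector of cluster labels, one per row, analogous to kmeans()$cluster
cluster_vec <- fit$clustering
table(cluster_vec, df$group)
```

The resulting cluster_vec can be attached back to the data with df$cluster <- cluster_vec.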

If you do not strictly need Gower distance, you can use the kproto() function from the clustMixType package instead. The distance measure in kproto is similar to Gower distance, with one difference: kproto uses Euclidean distance to measure dissimilarity between numeric variables, whereas Gower distance normalizes each numeric variable by dividing the distance between two observations by the range of that variable. The code is pretty simple.

library(clustMixType)
kproto_clustering <- kproto(df, k)   # k is the number of clusters
clusters <- kproto_clustering$cluster
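
To make the range normalization mentioned above concrete, here is a small hand computation of the Gower dissimilarity between two rows, checked against daisy() from the cluster package. It is only a sketch under simplifying assumptions: equal variable weights, no missing values, and factors treated as nominal. The helper gower_pair and the toy data are hypothetical.

```r
library(cluster)  # daisy() is used only as a reference value

# Toy data: two numeric variables and one nominal factor
df <- data.frame(
  age    = c(20, 30, 60),
  income = c(1000, 3000, 2000),
  color  = factor(c("red", "blue", "red"))
)

# Hypothetical helper: Gower dissimilarity between rows i and j
gower_pair <- function(i, j, data) {
  parts <- sapply(names(data), function(v) {
    x <- data[[v]]
    if (is.numeric(x)) {
      # numeric: absolute difference divided by the variable's range
      abs(x[i] - x[j]) / diff(range(x))
    } else {
      # nominal: 0 if the categories match, 1 otherwise
      as.numeric(x[i] != x[j])
    }
  })
  mean(parts)  # equal weights across variables
}

# Rows 1 and 2: age 10/40 = 0.25, income 2000/2000 = 1, color differs = 1
manual <- gower_pair(1, 2, df)   # (0.25 + 1 + 1) / 3 = 0.75
reference <- as.matrix(daisy(df, metric = "gower"))[1, 2]
all.equal(manual, reference)
```

Because every per-variable contribution lies between 0 and 1, no single numeric variable dominates just because it is measured on a larger scale, which is the practical difference from an unscaled Euclidean distance.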
