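How to edit my own k-means function so that it takes the number of clusters as input instead of centers in R?
How can I edit this function so that it takes "k" (the number of clusters) as input instead of the centers, as it currently does? The code is as follows: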
# Calculates Euclidean distance between each row of points1 and each row of points2
euclid <- function(points1, points2) {
  distanceMatrix <- matrix(NA, nrow=dim(points1)[1], ncol=dim(points2)[1])
  for(i in 1:nrow(points2)) {
    distanceMatrix[,i] <- sqrt(rowSums(t(t(points1)-points2[i,])^2))
  }
  distanceMatrix
}
# k-means algorithm
k_means = function(x, centers, distFun) {
  prevClusters = NULL
  prevCenters = NULL
  repeat {
    distsToCenters = distFun(x, centers)
    clusters = apply(distsToCenters, 1L, which.min)
    centers = apply(x, 2L, tapply, clusters, mean) # If I replace 'mean' here with 'centroid', error comes
    if (identical(prevClusters, clusters)) break
    prevClusters = clusters
    prevCenters = centers
  }
  list(clusters = clusters, centers = centers)
}
test=data # A data.frame
ktest=as.matrix(test) # Turn into a matrix
centers <- ktest[sample(nrow(ktest), 5),] # Sample some centers, 5 for example
res <- k_means(ktest, centers, euclid)
print(res)
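With a data matrix as input, the result is the cluster assignments followed by the cluster centers. Can the function be edited so that it takes the desired number of clusters as input instead of the desired centers? In other words, how should "clusters" be defined so that it can be used as an input?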
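First, I would suggest not reinventing the wheel, since R provides a kmeans implementation out of the box. However, if your function is given only the number of clusters, you can pick points at random within the range of the data. Something like: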
if (length(centers)==1) {
  k<-as.integer(centers)
  extrema<-apply(x,2,range)
  centers<-apply(extrema,2,function(.x) runif(k,.x[1],.x[2]))
}
right at the beginning of the function.
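Put together, the function might look like the sketch below. It is one possible reading of the suggestion, not a tested replacement: the two `matrix(...)` wrappers are my own additions (assumptions, not part of the original code) so that `centers` stays a matrix when `k` is 1 or when only one cluster remains non-empty. Because the starting centers are drawn uniformly from the bounding box of the data, a center can end up with no points, in which case the result has fewer than `k` clusters.

```r
# Euclidean distance between each row of points1 and each row of points2
euclid <- function(points1, points2) {
  distanceMatrix <- matrix(NA_real_, nrow = nrow(points1), ncol = nrow(points2))
  for (i in seq_len(nrow(points2))) {
    distanceMatrix[, i] <- sqrt(rowSums(t(t(points1) - points2[i, ])^2))
  }
  distanceMatrix
}

# k-means that accepts either a matrix of starting centers or a single number k
k_means <- function(x, centers, distFun) {
  if (length(centers) == 1) {
    # A single number was passed: treat it as k and draw k random
    # starting centers inside the range of each column of x
    k <- as.integer(centers)
    extrema <- apply(x, 2, range)
    centers <- apply(extrema, 2, function(.x) runif(k, .x[1], .x[2]))
    centers <- matrix(centers, nrow = k)  # assumption: keep a matrix even if k == 1
  }
  prevClusters <- NULL
  repeat {
    distsToCenters <- distFun(x, centers)
    clusters <- apply(distsToCenters, 1L, which.min)
    centers <- apply(x, 2L, tapply, clusters, mean)
    # assumption: keep a matrix even if only one cluster is non-empty
    centers <- matrix(centers, ncol = ncol(x))
    if (identical(prevClusters, clusters)) break
    prevClusters <- clusters
  }
  list(clusters = clusters, centers = centers)
}

# Example with made-up data: two groups of 50 points in 2 dimensions
set.seed(42)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))
res <- k_means(x, 3, euclid)   # pass the number of clusters
table(res$clusters)            # cluster sizes
res$centers                    # one row per non-empty cluster
```

Passing a matrix of starting centers still works as before, for example `k_means(x, x[sample(nrow(x), 3), ], euclid)`. For comparison, the built-in `kmeans(x, centers = 3)` accepts either a number of clusters or a set of initial centers in the same argument.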