
K-means algorithm, R

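Hi everyone! I was asked to implement the K-means algorithm in R, but I don't really know the language, so I found some example code on the internet and decided to use it. I studied it, looked up the functions it uses, and corrected it a little because it didn't work well. Here is the code: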

# Creating a sample of data
y=rnorm(500,1.65)
x=rnorm(500,1.15)
x=cbind(x,y)
centers <- x[sample(nrow(x),5),]

# A function for calculating the distance between centers and the rest of the dots
euclid <- function(points1, points2) {
  distanceMatrix <- matrix(NA, nrow=dim(points1)[1], ncol=dim(points2)[1])
  for(i in 1:nrow(points2)) {
    distanceMatrix[,i] <- sqrt(rowSums(t(t(points1)-points2[i,])^2))
  }
  distanceMatrix
}


# A method function
K_means <- function(x, centers, euclid, nItter) {
  clusterHistory <- vector(nItter, mode="list")
  centerHistory <- vector(nItter, mode="list")

  for(i in 1:nItter) {
    distsToCenters <- euclid(x, centers)
    clusters <- apply(distsToCenters, 1, which.min)
    centers <- apply(x, 2, tapply, clusters, mean)
    # Saving history
    clusterHistory[[i]] <- clusters
    centerHistory[[i]] <- centers
  }

  structure(list(clusters = clusterHistory, centers = centerHistory))

}


res <- K_means(x, centers, euclid, 5)
# To reuse the same plotting calls I had to use unlist, since my function returns a list of lists
# rather than a flat list. The object also stores the history of every iteration.
res <- unlist(res, recursive = FALSE)
plot(x, col = res$clusters5)
points(res$centers5, col = 1:5, pch = 8, cex = 2)

This works fine on that simple matrix. But I was asked to run it on the iris dataset:

head(iris)
a <-data.frame(iris$Sepal.Length, iris$Sepal.Width, iris$Petal.Length, iris$Petal.Width)
centers <- a[sample(nrow(a),3),]
iris_clusters <- K_means(a, centers, euclid, 3)
iris_clusters <- unlist(iris_clusters, recursive = FALSE)
head(iris_clusters)

The problem is that it doesn't work. The error is:

Error in distanceMatrix[, i] <- sqrt(rowSums(t(t(points1) - points2[i,  : 
  number of items to replace is not a multiple of replacement length 

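I understand that the dimensions of the objects don't match, but I can't see why. That's why I'm asking for help. I apologize for anything silly in this code; I'm not very familiar with the language yet, so please don't judge me too harshly. Thanks!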

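Your implementation works after a simple type conversion. Here `a` and `centers` are data frames, so inside `euclid` the expression `points2[i,]` is a one-row data frame rather than a numeric vector, and the subtraction no longer yields one distance per point. Passing matrices instead avoids that: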

iris_clusters <- K_means(as.matrix(a), as.matrix(centers), euclid, 3) # 3 iterations

iris_clusters <- unlist(iris_clusters, recursive = FALSE)

# plotting the clusters obtained on the first two dimensions at the end of 3rd iteration

plot(a[,1:2], col = iris_clusters$clusters3, pch=19)
# three centers here, so only three colours are needed
points(iris_clusters$centers3, col = 1:3, pch = 8, cex = 2)
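To see why the conversion matters, here is a minimal sketch of how row indexing differs between the two types. The names `pts_df`, `pts_mat`, `row_df`, `row_mat`, `d_mat` and `d_df` are made up for this illustration, and only the first six rows of iris are used to keep it small:

```r
# Illustrative sketch (hypothetical names): data frame vs matrix row indexing.
pts_df  <- iris[1:6, 1:4]       # data frame, like `a` in the question
pts_mat <- as.matrix(pts_df)    # numeric matrix, like as.matrix(a)

row_df  <- pts_df[1, ]          # still a data frame: 1 row, 4 columns
row_mat <- pts_mat[1, ]         # plain numeric vector of length 4

is.data.frame(row_df)
is.numeric(row_mat)

# Matrix case: the vector is recycled down the columns of t(pts_mat),
# so we get one distance per point, which is what euclid() expects.
d_mat <- sqrt(rowSums(t(t(pts_mat) - row_mat)^2))
length(d_mat)

# Data frame case: the arithmetic is dispatched to the data frame method
# and the result does not have one entry per point.
d_df <- sqrt(rowSums(t(t(pts_df) - row_df)^2))
length(d_df)
```

If the two lengths differ when you run this, that mismatch is what the assignment into `distanceMatrix[, i]` is complaining about.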


head(iris_clusters)

# cluster assignments and centroids computed at different iterations

$clusters1
  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 3 2 3 2 3 2 3 3 3 3 2 3 3 3 3 3 3 2 3 2 2 3 3
 [77] 2 2 3 3 3 3 3 2 3 3 2 3 3 3 3 2 3 3 3 3 3 3 3 3 1 2 1 2 1 1 3 1 1 1 2 2 2 2 2 2 2 1 1 2 1 2 1 2 1 1 2 2 2 1 1 1 2 2 2 1 2 2 2 2 1 2 2 1 1 2 2 2 2 2

$clusters2
  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 3 2 3 3 2 2 2 3 2 2 2 2 3 2 2 2 2 2 2
 [77] 2 2 2 3 3 3 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 3 2 1 2 1 2 1 1 2 1 1 1 2 2 1 2 2 2 2 1 1 2 1 2 1 2 1 1 2 2 2 1 1 1 2 2 2 1 2 2 2 1 1 2 2 1 1 2 2 2 2 2

$clusters3
  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [77] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 3 2 1 2 1 2 1 1 2 1 1 1 2 2 1 2 2 2 2 1 1 2 1 2 1 2 1 1 2 2 1 1 1 1 1 2 2 1 1 2 2 1 1 1 2 1 1 1 2 2 2 2

$centers1
  iris.Sepal.Length iris.Sepal.Width iris.Petal.Length iris.Petal.Width
1          7.150000         3.120000          6.090000        2.1350000
2          6.315909         2.915909          5.059091        1.8000000
3          5.297674         3.115116          2.550000        0.6744186

$centers2
  iris.Sepal.Length iris.Sepal.Width iris.Petal.Length iris.Petal.Width
1          7.122727         3.113636          6.031818        2.1318182
2          6.123529         2.852941          4.741176        1.6132353
3          5.056667         3.268333          1.810000        0.3883333

$centers3
  iris.Sepal.Length iris.Sepal.Width iris.Petal.Length iris.Petal.Width
1          7.014815         3.096296          5.918519         2.155556
2          6.025714         2.805714          4.588571         1.518571
3          5.005660         3.369811          1.560377         0.290566
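Note that the assignments above are still changing between iterations 2 and 3, so three iterations may not be enough for the algorithm to settle. As a rough cross-check, you could compare against the built-in `kmeans` from the stats package. This is only a sketch: the starting rows `c(1, 51, 101)` (one flower from each species) and the names `a_mat`, `start` and `fit` are my own choices for illustration, not something from your code:

```r
# Sketch: cross-check against base R's kmeans() with Lloyd's algorithm,
# which uses the same assign-then-average scheme as K_means().
a_mat <- as.matrix(iris[, 1:4])

# Assumption: fixed, distinct starting rows chosen only for reproducibility.
start <- a_mat[c(1, 51, 101), ]

fit <- kmeans(a_mat, centers = start, iter.max = 100, algorithm = "Lloyd")

fit$centers                         # final centroids
table(fit$cluster, iris$Species)    # clusters vs. the known species labels
```

Feeding the same `start` matrix into `K_means` with a larger `nItter` should then give you something to compare the final centers against.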
