Hello everyone! I've been asked to implement a K-means algorithm in R, but I don't really know the language, so I found some example code on the internet and decided to use it. I've looked into it, learned the functions it uses, and corrected it a bit, because it didn't work very well. Here's the code:
# Creating a sample of data
y <- rnorm(500, 1.65)
x <- rnorm(500, 1.15)
x <- cbind(x, y)
centers <- x[sample(nrow(x), 5), ]
# A function for calculating the distances between the centers and the rest of the points
euclid <- function(points1, points2) {
  distanceMatrix <- matrix(NA, nrow = nrow(points1), ncol = nrow(points2))
  for (i in seq_len(nrow(points2))) {
    # Column i holds the distance from every point to center i
    distanceMatrix[, i] <- sqrt(rowSums(t(t(points1) - points2[i, ])^2))
  }
  distanceMatrix
}
# The K-means function itself
K_means <- function(x, centers, euclid, nItter) {
  clusterHistory <- vector(nItter, mode = "list")
  centerHistory <- vector(nItter, mode = "list")
  for (i in seq_len(nItter)) {
    distsToCenters <- euclid(x, centers)
    # Assign each point to its nearest center
    clusters <- apply(distsToCenters, 1, which.min)
    # Recompute each center as the column-wise mean of its assigned points
    centers <- apply(x, 2, tapply, clusters, mean)
    # Saving history
    clusterHistory[[i]] <- clusters
    centerHistory[[i]] <- centers
  }
  list(clusters = clusterHistory, centers = centerHistory)
}
res <- K_means(x, centers, euclid, 5)
# To reuse the same plot calls I had to use unlist(), since my function returns a list of lists,
# while the default object is just a list. I also store the history of each iteration in that object.
res <- unlist(res, recursive = FALSE)
plot(x, col = res$clusters5)
points(res$centers5, col = 1:5, pch = 8, cex = 2)
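As a side note on the `unlist(res, recursive = FALSE)` step: the names `clusters5` and `centers5` are generated by `unlist()` itself, which appends a running index to the outer name when the inner lists are unnamed. A minimal sketch with a toy list (the name `toy` and its values are made up for illustration):

```r
# Toy stand-in for the object returned by K_means(): a list of two unnamed lists.
toy <- list(
  clusters = list(c(1, 2, 1), c(2, 2, 1)),
  centers  = list(matrix(1:4, 2), matrix(5:8, 2))
)

# Flatten one level only; the inner vectors and matrices stay intact.
flat <- unlist(toy, recursive = FALSE)

names(flat)      # "clusters1" "clusters2" "centers1" "centers2"
flat$clusters2   # the cluster vector from the second "iteration"
```

Before flattening, the same elements would be reachable as `res$clusters[[5]]` and `res$centers[[5]]`, so the `unlist()` call is a convenience rather than a requirement.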
It works fine on this simple matrix. But I've been asked to use it on iris:
head(iris)
a <- data.frame(iris$Sepal.Length, iris$Sepal.Width, iris$Petal.Length, iris$Petal.Width)
centers <- a[sample(nrow(a),3),]
iris_clusters <- K_means(a, centers, euclid, 3)
iris_clusters <- unlist(iris_clusters, recursive = FALSE)
head(iris_clusters)
And the problem is that it doesn't work. The error is:
Error in distanceMatrix[, i] <- sqrt(rowSums(t(t(points1) - points2[i, :
number of items to replace is not a multiple of replacement length
I understand that the dimensions of the objects don't match, but I don't understand why. That's why I'm asking for help. I apologize in advance for any stupidity in this code; I'm not really familiar with the language yet, so please don't judge me too harshly. Thank you!
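Your implementation should work with simple typecasts: euclid relies on matrix arithmetic, but a and centers are data frames, so convert both with as.matrix() before calling K_means():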
iris_clusters <- K_means(as.matrix(a), as.matrix(centers), euclid, 3) # 3 iterations
iris_clusters <- unlist(iris_clusters, recursive = FALSE)
# plotting the clusters obtained on the first two dimensions at the end of the 3rd iteration
plot(a[, 1:2], col = iris_clusters$clusters3, pch = 19)
# there are only 3 centers here, so 3 colors are enough
points(iris_clusters$centers3, col = 1:3, pch = 8, cex = 2)
head(iris_clusters)
# cluster assignments and centroids computed at different iterations
$clusters1
[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 3 2 3 2 3 2 3 3 3 3 2 3 3 3 3 3 3 2 3 2 2 3 3
[77] 2 2 3 3 3 3 3 2 3 3 2 3 3 3 3 2 3 3 3 3 3 3 3 3 1 2 1 2 1 1 3 1 1 1 2 2 2 2 2 2 2 1 1 2 1 2 1 2 1 1 2 2 2 1 1 1 2 2 2 1 2 2 2 2 1 2 2 1 1 2 2 2 2 2
$clusters2
[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 3 2 3 3 2 2 2 3 2 2 2 2 3 2 2 2 2 2 2
[77] 2 2 2 3 3 3 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 3 2 1 2 1 2 1 1 2 1 1 1 2 2 1 2 2 2 2 1 1 2 1 2 1 2 1 1 2 2 2 1 1 1 2 2 2 1 2 2 2 1 1 2 2 1 1 2 2 2 2 2
$clusters3
[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[77] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 3 2 1 2 1 2 1 1 2 1 1 1 2 2 1 2 2 2 2 1 1 2 1 2 1 2 1 1 2 2 1 1 1 1 1 2 2 1 1 2 2 1 1 1 2 1 1 1 2 2 2 2
$centers1
  iris.Sepal.Length iris.Sepal.Width iris.Petal.Length iris.Petal.Width
1          7.150000         3.120000          6.090000        2.1350000
2          6.315909         2.915909          5.059091        1.8000000
3          5.297674         3.115116          2.550000        0.6744186
$centers2
  iris.Sepal.Length iris.Sepal.Width iris.Petal.Length iris.Petal.Width
1          7.122727         3.113636          6.031818        2.1318182
2          6.123529         2.852941          4.741176        1.6132353
3          5.056667         3.268333          1.810000        0.3883333
$centers3
  iris.Sepal.Length iris.Sepal.Width iris.Petal.Length iris.Petal.Width
1          7.014815         3.096296          5.918519         2.155556
2          6.025714         2.805714          4.588571         1.518571
3          5.005660         3.369811          1.560377         0.290566
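The cast is needed because row-indexing a data frame returns a one-row data frame rather than a numeric vector, which the arithmetic in `euclid` does not expect. As a minimal sketch, assuming you would rather have the distance function accept data frames directly, it can coerce its own inputs. Note that `euclid_safe` is a hypothetical name, not part of the original code, and `sweep()` is swapped in for the double transpose:

```r
# Hypothetical variant of euclid() that coerces its inputs, so data frames work too.
euclid_safe <- function(points1, points2) {
  points1 <- as.matrix(points1)
  points2 <- as.matrix(points2)
  distanceMatrix <- matrix(NA_real_, nrow = nrow(points1), ncol = nrow(points2))
  for (i in seq_len(nrow(points2))) {
    # sweep() subtracts center i from every row of points1 (default FUN is "-")
    diffs <- sweep(points1, 2, points2[i, ])
    distanceMatrix[, i] <- sqrt(rowSums(diffs^2))
  }
  distanceMatrix
}

a <- iris[, 1:4]
centers <- a[sample(nrow(a), 3), ]

class(a[1, ])             # "data.frame": a one-row data frame, not a numeric vector
class(as.matrix(a)[1, ])  # "numeric"

d <- euclid_safe(a, centers)
dim(d)                    # 150 rows (points) by 3 columns (centers)
```

With this variant, the distance step accepts either data frames or matrices. The center update inside `K_means` already returns a matrix, so only the initial inputs are affected.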