
How can I interpret the results of the R kmeans function?

I have a large data set describing 81432 images. Each description is produced by an image descriptor that generates a 127-element vector per image, so I have a matrix with 81432 rows and 127 columns.

I'm running kmeans in R, but I don't know how to interpret the results. I set a number of clusters and the algorithm runs, but then what? I would also like to plot the elbow rule, but I don't know how to do that either.

An example code snippet using k-means and principal component analysis (PCA) to analyze and visualize a data set:

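# Only rpanel (rp.plot3d), rgl (used by rp.plot3d), scatterplot3d, and
# cluster (clusplot) are needed below; the remaining packages are optional.
library(calibrate)
library(plyr)
library(gclus)
library(scatterplot3d)
library(cluster)
library(fpc)
library(mclust)
library(rpanel)
library(rgl)
library(lattice)
library(tm)
library(RColorBrewer)
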
# Read data (the path is the example's own; point it at your file)
mydata <- read.table(file="c:/data.mtx", header=TRUE, row.names=1, sep="")

# Look at the absolute correlations between the scaled variables
mydata.cor <- abs(cor(scale(mydata)))
mydata.cor[,1:2]

# Look at the data in an interactive 3D plot before PCA
rp.plot3d(mydata[,1], mydata[,2], mydata[,3])

# Doing the PCA (the prcomp argument is spelled scale., with a trailing dot)
mydata.pca <- prcomp(mydata, retx=TRUE, center=TRUE, scale.=TRUE)
summary(mydata.pca)
# 3D plot of the first three PCs
rp.plot3d(mydata.pca$x[,1], mydata.pca$x[,2], mydata.pca$x[,3])


#Eigenvalues of components for Kaiser Criterion
mydata.pca$sdev ^2


#scree test for determining optimal number of PCs (Elbow rule)
par(mfrow=c(1,2))
screeplot(mydata.pca,main="Scree Plot",xlab="Components")
screeplot(mydata.pca,type="line",main="Scree Plot")

# Scores (the data projected onto the principal components)
scores <- mydata.pca$x
# Plot the first two scores, with axes through the origin
pdf("scores.pdf")
plot(scores[,1], scores[,2], xlab="Scores 1", ylab="Scores 2")
text(x=scores[,1], y=scores[,2], labels=row.names(scores), cex=0.4, col="blue")
abline(h=0, v=0, lty=2)  # dashed horizontal and vertical axes
dev.off()

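# Finding a possible number of clusters for k-means (elbow plot)
scaled <- scale(mydata)  # scale once instead of on every call
set.seed(1)  # kmeans starts from random centers; the seed value is arbitrary
# For one cluster, the within-groups sum of squares is the total sum of squares
wss <- (nrow(scaled)-1)*sum(apply(scaled,2,var))
for (i in 2:20) wss[i] <- kmeans(scaled, centers=i)$tot.withinss
plot(1:20, wss, type="b", xlab="Number of Clusters", ylab="Within groups sum of squares")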

# Performing k-means on the first two PCs and visualizing the result
km1 <- kmeans(scores[,1:2], centers=4, algorithm="Hartigan-Wong")
#par(mfrow = c(1, 1))
pdf("km.pdf")
plot(scores[,1:2], col=km1$cluster)
points(km1$centers, col=1:4, pch=8, cex=2)  # one color per center (4 centers)
# km1$centers has only two columns (PC1, PC2), so this plot has no third
# principal component to show; cluster on scores[,1:3] for a true 3D view
scatterplot3d(km1$centers, pch=20, highlight.3d=TRUE, type="h")
# Cluster means
aggregate(scores[,1:2], by=list(cluster=km1$cluster), FUN=mean)
# Append the cluster assignment to the scores
clustercounts <- data.frame(scores[,1:2], cluster=km1$cluster)
# Cluster plot against the first two principal components
clusplot(scores[,1:2], km1$cluster, color=TRUE, shade=TRUE, labels=2, lines=0, cex=0.2)
dev.off()
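
As for reading the result itself, the object returned by kmeans is a list whose components can be inspected directly. The following is a minimal sketch, assuming R's built-in iris measurements as stand-in data (they are not the image descriptors from the question) and an arbitrary choice of 3 clusters:

```r
# Illustrative data only: iris stands in for the 81432 x 127 descriptor matrix.
x <- scale(iris[, 1:4])

set.seed(42)  # k-means starts from random centers; the seed value is arbitrary
km <- kmeans(x, centers = 3, nstart = 25)

km$cluster[1:10]  # cluster index (1..k) assigned to each row
km$centers        # k x p matrix: one centroid per cluster, in scaled units
km$size           # number of rows in each cluster
km$withinss       # within-cluster sum of squares, one value per cluster
km$tot.withinss   # sum(km$withinss): overall compactness, lower is tighter
km$betweenss      # between-cluster sum of squares
km$totss          # total sum of squares = tot.withinss + betweenss

# Share of the total sum of squares accounted for by the clustering (0 to 1)
explained <- km$betweenss / km$totss
```

In practice, km$cluster says which images ended up together, km$centers gives the "average descriptor" of each group, and the sums of squares indicate how compact and how well separated the groups are.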

To plot the elbow rule, use tot.withinss (the total within-cluster sum of squares), which measures how close the points are to their assigned centroids. Run kmeans for a range of cluster counts and look for the count beyond which tot.withinss stops dropping sharply.
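
A minimal sketch of such an elbow plot, assuming the built-in iris data as a placeholder for the real matrix; the range 1 to 10 and the nstart value are arbitrary choices for illustration:

```r
# Placeholder data: replace x with your own (scaled) numeric matrix.
x <- scale(iris[, 1:4])

ks <- 1:10    # candidate numbers of clusters
set.seed(42)  # make the random starts reproducible

# Fit k-means once per candidate k and keep the total within-cluster sum of squares
tot_wss <- sapply(ks, function(k) {
  kmeans(x, centers = k, nstart = 20)$tot.withinss
})

# The "elbow" is the k after which the curve flattens out
plot(ks, tot_wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

On a matrix as large as the one in the question, each fit can take a while, so it may be worth trying the loop on a random subset of rows first.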

This answer is regarding the use of R.
