# k-means clustering: why are all the clusters the same?

cluster1: book flight NA

cluster2: flight book NA

cluster3: flight book NA

cluster4: flight book NA

cluster5: book flight NA

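``````
library(tm)  # Corpus, tm_map, TermDocumentMatrix, removeSparseTerms

# build the corpus from the text element and keep a copy for stem completion
myCorpus<-Corpus(VectorSource(myCorpus$text))
myCorpusCopy<-myCorpus
myCorpus<-tm_map(myCorpus,stemDocument)
myCorpus<-tm_map(myCorpus,stemCompletion,dictionary=myCorpusCopy)

# term-document matrix, dropping terms missing from more than 95% of documents
myTdm<-TermDocumentMatrix(myCorpus,control=list(wordLengths=c(1,Inf)))
myTdm2<-removeSparseTerms(myTdm,sparse=0.95)
m2<-as.matrix(myTdm2)
m3<-t(m2)  # documents in rows, terms in columns

set.seed(122)
k<-5
kmeansResult<-kmeans(m3,k)
round(kmeansResult$centers,digits=3)

# print the three highest-weighted terms of each cluster center
for(i in 1:k){
cat(paste("cluster",i,":",sep=""))
s<-sort(kmeansResult$centers[i,],decreasing=TRUE)
# [1:3] pads with NA when fewer than three terms are left
cat(names(s)[1:3],"\n")
}
``````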
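One possible reading of the output above, offered as an assumption rather than a diagnosis: if only two terms survive `removeSparseTerms`, then `names(s)[1:3]` pads the missing third term with `NA`, and every cluster lists the same two words in one order or the other. The sketch below reproduces that with a made-up two-column matrix. The `book` and `flight` counts are invented for illustration and are not the asker's data.

```r
# Hypothetical document-term matrix with only two surviving terms.
grid <- expand.grid(book = c(0, 1, 2, 3), flight = c(0, 1, 2, 3))
m3 <- as.matrix(grid)  # 16 distinct "documents", 2 terms

set.seed(122)
k <- 5
kmeansResult <- kmeans(m3, k)

# Top-3 terms per cluster: only two columns exist, so the third slot is NA.
top <- t(sapply(seq_len(k), function(i) {
  s <- sort(kmeansResult$centers[i, ], decreasing = TRUE)
  names(s)[1:3]
}))
print(top)

# The centers themselves show how the clusters actually differ.
print(round(kmeansResult$centers, digits = 3))
```

Checking `dim(m3)` on the real data would confirm or rule this out. If it does show only two columns, a looser `sparse` value (closer to 1) keeps more terms.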

## 1 Answer

### #1 (votes: 0)

``````
# install.packages("NbClust")
library(NbClust)
set.seed(1234)
df <- rbind(matrix(rnorm(100,sd=0.1),ncol=2),
            matrix(rnorm(100,mean=1,sd=0.2),ncol=2),
            matrix(rnorm(100,mean=5,sd=0.1),ncol=2),
            matrix(rnorm(100,mean=7,sd=0.2),ncol=2))

# "scree" plots on appropriate number of clusters (you should look
# for a bend in the graph)
nc <- NbClust(df, min.nc=2, max.nc=20, method="kmeans")
table(nc$Best.n[1,])

# creating a bar chart to visualize results on appropriate number
# of clusters
barplot(table(nc$Best.n[1,]),
        xlab="Number of Clusters", ylab="Number of Criteria",
        main="Number of Clusters Chosen by Criteria")
``````
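If installing `NbClust` is not an option, the "bend in the graph" idea can be roughly approximated with base R alone. This is a minimal sketch of a plain elbow check on the same simulated `df`, not what `NbClust` computes internally. The range `1:8` and `nstart = 25` are arbitrary choices made for the example.

```r
set.seed(1234)
df <- rbind(matrix(rnorm(100, sd = 0.1), ncol = 2),
            matrix(rnorm(100, mean = 1, sd = 0.2), ncol = 2),
            matrix(rnorm(100, mean = 5, sd = 0.1), ncol = 2),
            matrix(rnorm(100, mean = 7, sd = 0.2), ncol = 2))

# Total within-cluster sum of squares for each candidate number of clusters.
ks <- 1:8
wss <- sapply(ks, function(k) kmeans(df, centers = k, nstart = 25)$tot.withinss)

print(data.frame(k = ks, wss = round(wss, 2)))
# plot(ks, wss, type = "b")  # uncomment to draw the curve and look for the bend
```

The value of k after which `wss` stops dropping sharply is the candidate to try in `kmeans`.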
