
Reduce the dimensions of a dataset after applying PCA on it in R

My question is how to use the principal components obtained in R.

Once you have the principal components, how do you use them to reduce the dimensions? I have a data_set containing 6 variables, and I need to cluster it using k-means. K-means gives me a scattered plot when I do the clustering on all 6 variables. I thought PCA could help reduce the dimensions, so that k-means could produce more useful results.

I did this to get the principal components:

pca1 <- prcomp(data_set)

Please guide me on how to proceed further to reduce the dimensionality of the data set.

You can see which values a function returns by reading its help page, for example by typing ?prcomp. This is what I used to do with another package:

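library("FactoMineR")
library(ggplot2)

# Run the PCA on standardised variables, without the automatic plots
pca <- PCA(dataset, scale.unit = TRUE, graph = FALSE)

# Coordinates of the individuals on the principal components (the scores)
scores <- data.frame(pca$ind$coord)

# Score plot for the first two components
ggplot(scores, aes(Dim.1, Dim.2)) +
  geom_text(label = rownames(scores), colour = "red") +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  labs(title = "Score plot")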

This gives you the score plot for PC1 and PC2. The loadings plot is built the same way, starting from the variable coordinates:

loadings <- data.frame(pca$var$coord)
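To connect this back to the question with prcomp, here is a minimal sketch of one common way to go from principal components to a lower-dimensional k-means input. It is not the only approach. Several things in it are assumptions made for illustration: the synthetic data standing in for data_set, the 80% cumulative-variance threshold, and the choice of 3 clusters. The names var_explained, cum_var, n_comp and reduced are hypothetical helpers, not anything from the original post.

```r
# Assumption: synthetic 6-variable data standing in for the asker's data_set.
set.seed(42)
data_set <- as.data.frame(matrix(rnorm(600), ncol = 6))

# Centre and scale so that no variable dominates only because of its units.
pca1 <- prcomp(data_set, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component, and its running total.
var_explained <- pca1$sdev^2 / sum(pca1$sdev^2)
cum_var <- cumsum(var_explained)

# Keep the smallest number of components that reaches 80% of the variance.
# The 0.80 threshold is an assumption; pick whatever suits your data.
n_comp <- which(cum_var >= 0.80)[1]

# pca1$x holds the observations projected onto the components (the scores).
# Keeping only the first n_comp columns is the dimensionality reduction.
reduced <- pca1$x[, seq_len(n_comp), drop = FALSE]

# Cluster in the reduced space. centers = 3 is an arbitrary illustrative choice.
km <- kmeans(reduced, centers = 3, nstart = 25)
```

After this, km$cluster holds one cluster label per row of data_set, and something like plot(pca1$x[, 1:2], col = km$cluster) shows the clusters on the first two components. Note that summary(pca1) prints the same cumulative proportions, which is a quick way to check how many components are worth keeping.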
