
Principal component analysis on a correlation matrix

Many functions in R can perform Principal Component Analysis (PCA) on raw data. By raw data I mean any data frame or matrix whose rows are observations and whose columns are measurements. Can we carry out PCA on a correlation matrix in R? Which function accepts a correlation matrix as its input?

As mentioned in the comments, you can use

ii <- as.matrix(iris[,1:4])
princomp(covmat=cor(ii))

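This gives results equivalent to princomp(ii,cor=TRUE). The latter is not what you want, though: it takes the full data matrix and returns the values computed after the covariance matrix is converted to a correlation matrix. (Note that princomp() needs numeric input, so pass ii rather than the full iris data frame, which also contains the Species factor.)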
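As a quick check of that equivalence, the sketch below fits both versions on the numeric columns of iris and compares them. The object names p_cov and p_raw are only illustrative; the loadings are compared in absolute value because the sign of each component is arbitrary.

```r
# Numeric columns of the built-in iris data
ii <- as.matrix(iris[, 1:4])

# PCA from the correlation matrix alone (princomp never sees the raw data)
p_cov <- princomp(covmat = cor(ii))

# PCA from the raw data, with cor = TRUE
p_raw <- princomp(ii, cor = TRUE)

# Component standard deviations agree
sdev_match <- all.equal(unname(p_cov$sdev), unname(p_raw$sdev))

# Loadings agree up to the sign of each column
load_match <- all.equal(abs(unclass(p_cov$loadings)),
                        abs(unclass(p_raw$loadings)),
                        check.attributes = FALSE)

# Scores need the raw data, so the covmat fit has none
no_scores <- is.null(p_cov$scores)

sdev_match
load_match
no_scores
```

The last line illustrates the practical difference: with only a correlation matrix you get standard deviations and loadings, but no scores.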


You can also do all the relevant computations by hand if you have the correlation matrix:

cc <- cor(ii)
e1 <- eigen(cc)

Standard deviations:

sqrt(e1$values)
[1] 1.7083611 0.9560494 0.3830886 0.1439265

Proportion of variance:

e1$values/sum(e1$values)
[1] 0.729624454 0.228507618 0.036689219 0.005178709
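The two quantities above can be collected into a table similar to what summary() prints for a princomp fit. This is only a sketch; pca_summary and its row labels are names chosen here for illustration.

```r
ii <- as.matrix(iris[, 1:4])
e1 <- eigen(cor(ii))

# Eigenvalues of a correlation matrix sum to the number of variables
total_var <- sum(e1$values)

# Standard deviation, proportion and cumulative proportion per component
pca_summary <- rbind(
  "Standard deviation"     = sqrt(e1$values),
  "Proportion of Variance" = e1$values / total_var,
  "Cumulative Proportion"  = cumsum(e1$values) / total_var
)
colnames(pca_summary) <- paste0("Comp.", seq_along(e1$values))

round(pca_summary, 4)
```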

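You can get the loadings via e1$vectors. Compute the scores (according to this Cross Validated question) via as.matrix(ii) %*% e1$vectors, using only the numeric columns; standardize them first, e.g. scale(ii) %*% e1$vectors, if you want scores on the correlation scale. This will not give numerically identical answers to princomp()$scores - the scaling, and possibly the signs, differ - but it gives equivalent results.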
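To see how the hand-computed scores relate to princomp()$scores, here is a minimal sketch. It assumes the raw data are available (scores cannot be recovered from the correlation matrix alone) and that the columns are standardized with scale(), which uses the n - 1 divisor, whereas princomp() uses divisor n. The names scores_hand and rescaled are illustrative.

```r
ii <- as.matrix(iris[, 1:4])
e1 <- eigen(cor(ii))

# Standardise the columns, then project onto the eigenvectors
z <- scale(ii)
scores_hand <- z %*% e1$vectors

# The variance of each score column equals the corresponding eigenvalue
var_match <- all.equal(unname(apply(scores_hand, 2, var)), e1$values)

# princomp() standardises with divisor n, so its scores are larger
# by a factor of sqrt(n / (n - 1))
p_raw <- princomp(ii, cor = TRUE)
n <- nrow(ii)
rescaled <- scores_hand * sqrt(n / (n - 1))

# Compare absolute values, since the sign of each component is arbitrary
scores_match <- all.equal(abs(unname(rescaled)),
                          abs(unname(p_raw$scores)),
                          check.attributes = FALSE)

var_match
scores_match
```

After the rescaling, the two sets of scores differ at most by the sign of individual columns, which does not affect the interpretation.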
