PCA and Hotelling's T^2 for confidence interval in R

Question

I ran a principal component analysis and kept the first two principal components, then plotted my observations by their scores on these two PCs. I would like to add to this plot the 95% confidence region corresponding to Hotelling's T^2 test, so that I can detect the points that fall outside the ellipse (outliers). How can this be done in R? Do you have an example?

I would like to produce something like this and detect the points outside the ellipse (figure not shown):

Answer 1

Confidence ellipses can be added to a PCA plot with vegan or ggbiplot, as shown below (both examples draw one ellipse per group):

set.seed(1)
data <- matrix(rnorm(500), ncol=5) # 100 rows of random data
# append 5 outlying rows and name the columns A..E
data <- setNames(as.data.frame(rbind(data, matrix(runif(25, 5, 10), ncol=5))), LETTERS[1:5])
grp <- sample(c(0,3,6,8), 105, replace=TRUE) # 4 arbitrary groups (values double as pch symbols)
cols <- as.integer(factor(grp)) # colours 1..4; col=0 would be drawn in the background colour

# vegan: PCA via rda(), then one ellipse per group
library(vegan)
PC <- rda(data, scale=TRUE)
pca_scores <- scores(PC, choices=c(1,2))
plot(pca_scores$sites[,1], pca_scores$sites[,2],
     pch=grp, col=cols, xlim=c(-2,2), ylim=c(-2,2))
arrows(0,0,pca_scores$species[,1],pca_scores$species[,2],lwd=1,length=0.2)
ordiellipse(PC,grp,conf=0.95)

# ggbiplot: PCA via prcomp(), then one ellipse per group
library(ggbiplot)
PC <- prcomp(data, scale. = TRUE)
ggbiplot(PC, obs.scale = 1, var.scale = 1, groups = as.factor(grp), ellipse = TRUE,
         ellipse.prob = 0.95)

Answer 2

The pcaMethods package (Bioconductor) has a function simpleEllipse(x, y, alpha, len) that does this. Given two uncorrelated vectors of scores, it returns the points of an ellipse whose axes are scaled by the variance of each score and by a quantile of the F distribution.
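
For readers who want to see the calculation itself, here is a minimal base-R sketch of the same idea. It is not the pcaMethods implementation. It assumes the T^2 of observation i is t_i1^2/s_1^2 + t_i2^2/s_2^2 and that the 95% limit is A(n^2-1)/(n(n-A)) times the 0.95 quantile of F(A, n-A), which is one common convention (others use A(n-1)/(n-A) instead). The names X, sc, s2, t2, lim, outlier and ell are made up for illustration.

```r
# Sketch: Hotelling's T^2 ellipse on the first two PC scores, base R only.
set.seed(1)
X <- rbind(matrix(rnorm(500), ncol = 5),        # 100 ordinary rows
           matrix(runif(25, 5, 10), ncol = 5))  # 5 outlying rows

pca <- prcomp(X, scale. = TRUE)
sc  <- pca$x[, 1:2]      # scores on PC1 and PC2 (already centred at 0)
s2  <- pca$sdev[1:2]^2   # variance of each score
n   <- nrow(sc)
A   <- 2                 # number of components kept

# T^2 of each observation: squared scores scaled by their variances
t2 <- sc[, 1]^2 / s2[1] + sc[, 2]^2 / s2[2]

# 95% limit under the convention assumed in the text above
lim <- A * (n^2 - 1) / (n * (n - A)) * qf(0.95, A, n - A)
outlier <- t2 > lim      # TRUE for points outside the ellipse

# Points on the ellipse where T^2 equals the limit
theta <- seq(0, 2 * pi, length.out = 200)
ell <- cbind(sqrt(lim * s2[1]) * cos(theta),
             sqrt(lim * s2[2]) * sin(theta))

plot(sc, pch = 19, col = ifelse(outlier, "red", "black"),
     xlim = range(sc[, 1], ell[, 1]), ylim = range(sc[, 2], ell[, 2]),
     xlab = "PC1", ylab = "PC2")
lines(ell, lty = 2)
which(outlier)           # row indices of the flagged observations
```

Comparing t2 with lim flags the outliers directly, so the ellipse is only needed for the plot. Keep in mind that extreme points also inflate the score variances, which widens the ellipse.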

2 answers

solution1
0 2017-03-07 13:39:26

solution2
0 2019-05-07 17:34:08