
PCA biplot group individuals

I have many individuals in my data (n = 600). I ran a PCA and would like to create a biplot of variables and individuals, with the variables coloured by their contribution. The individuals come from two groups, and I'd like to colour the points by group. A small example follows.

library(FactoMineR)
library(factoextra)
data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
head(decathlon2.active[, 1:6])
res.pca <- PCA(decathlon2.active, graph = FALSE)
fviz_pca_biplot(res.pca, col.var = "contrib", geom = "point") +
  scale_color_gradient2(low = "white", mid = "blue", high = "red", midpoint = 0.5) +
  theme_minimal()

res.pca_ind = data.frame(res.pca$ind)
res.pca_ind

Questions:

  1. How can I colour the row names SEBRLE and NOOL in red and the rest in black?
  2. How can I assign every row name to one of two factor levels (either assignment is fine, as this is only an example) and colour the two levels differently?

Partial answer:

sub = as.character(rownames(res.pca_ind))
decathlon3 = decathlon2[which(rownames(decathlon2) %in% sub),]

fviz_pca_biplot(res.pca, axes = c(1, 2), geom = c("point", "text"),
                label = "all", invisible = "none", labelsize = 2, pointsize = 2,
                habillage = decathlon3$Competition, addEllipses = FALSE, ellipse.level = 0.95,
                col.ind = "black", col.ind.sup = "blue", alpha.ind = 1,
                col.var = "steelblue", alpha.var = 1, col.quanti.sup = "blue",
                col.circle = NULL,
                select.var = list(name = NULL, cos2 = NULL, contrib = NULL),
                select.ind = list(name = NULL, cos2 = NULL, contrib = NULL),
                jitter = list(what = "label", width = NULL, height = NULL))

However, what I gain in one place I lose in another. I could not find a way to use habillage together with select.var by contrib, because the error "Error: Continuous value supplied to discrete scale" kept appearing.
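One plausible reading of this error, sketched with plain ggplot2 rather than factoextra: a discrete variable and a continuous variable are both mapped to the colour aesthetic, and a single colour scale cannot serve both. The data frame and its column names below are hypothetical toy values, not taken from decathlon2, and the exact wording of the error message may differ between ggplot2 versions.

```r
library(ggplot2)

# Hypothetical toy data: grp is discrete, val is continuous
df <- data.frame(x = 1:4, y = 1:4,
                 grp = c("a", "a", "b", "b"),
                 val = c(0.1, 0.4, 0.7, 0.9))

# The first layer maps colour to a discrete variable,
# the second layer maps colour to a continuous one
clash <- ggplot(df, aes(x, y)) +
  geom_point(aes(colour = grp)) +
  geom_point(aes(colour = val))

# The conflict surfaces when the plot is built, so capture the message there
clash_result <- tryCatch({
  ggplot_build(clash)
  "built"
}, error = function(e) conditionMessage(e))

print(clash_result)
```

In the biplot, habillage plays the role of grp (a discrete colour for individuals) and a contribution-based colour plays the role of val, which would explain why combining them fails on a single colour scale.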

If you don't need a legend:

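library(FactoMineR)
library(factoextra)

# Red for the two highlighted athletes, black for everyone else
point_col = ifelse(rownames(decathlon2.active) %in% c("SEBRLE", "NOOL"),
                   "red", "black")

res.pca <- PCA(decathlon2.active, graph = FALSE)
# geom = "" leaves the individuals undrawn so they can be added manually below
g = fviz_pca_biplot(res.pca, col.var = "contrib", geom = "") +
  scale_color_gradient2(low = "white", mid = "blue", high = "red", midpoint = 0.5) +
  theme_minimal()
g + geom_point(col = point_col)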


If you need a legend, I code the observations SEBRLE and NOOL as 1 and all others as 0, so that they map to red and black respectively:

library(ggnewscale)

g$data$group = +(rownames(decathlon2.active) %in% c("SEBRLE", "NOOL"))
g + new_scale_color() +
  geom_point(aes(col = factor(group))) +
  scale_color_manual(values = c("black", "red"))
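For the second question (every individual assigned to one of two groups), the same approach should carry over. The sketch below uses the Competition column of decathlon2 as the grouping factor, which is the variable the question passed to habillage. It assumes, as the code above does, that the rows of g$data are in the same order as the rows of decathlon2.active. The variable name p is introduced here for illustration only.

```r
library(FactoMineR)
library(factoextra)
library(ggplot2)
library(ggnewscale)

data(decathlon2)
decathlon2.active <- decathlon2[1:23, 1:10]
res.pca <- PCA(decathlon2.active, graph = FALSE)

# Base biplot: variables coloured by contribution, individuals left undrawn
g <- fviz_pca_biplot(res.pca, col.var = "contrib", geom = "") +
  scale_color_gradient2(low = "white", mid = "blue", high = "red", midpoint = 0.5) +
  theme_minimal()

# Assumption: g$data keeps the row order of decathlon2.active,
# so the grouping factor can be attached by position
g$data$group <- decathlon2$Competition[1:23]

# A second, independent colour scale for the individuals, with its own legend
p <- g + new_scale_color() +
  geom_point(aes(col = group)) +
  scale_color_manual(values = c("black", "red"), name = "Competition")
print(p)
```

Any other two-level factor of length 23 could be assigned to g$data$group in the same way.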

