How to label specific data points on a PCA plot in R using ggplot
I want to pick out 5 specific IDs and add labels to them so I can see where they are located on the PCA plot. I have used library(tidyverse). Thank you.
Without a minimal reproducible dataset it's difficult to know whether this approach will suit your purposes, but perhaps:
install.packages("tidyverse")
install.packages("factoextra")
install.packages("FactoMineR")
library(tidyverse)
library(factoextra)
library(FactoMineR)
data("iris")
# Create a 'label' for every point (NA)
iris$label <- NA
head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species label
#> 1 5.1 3.5 1.4 0.2 setosa NA
#> 2 4.9 3.0 1.4 0.2 setosa NA
#> 3 4.7 3.2 1.3 0.2 setosa NA
#> 4 4.6 3.1 1.5 0.2 setosa NA
#> 5 5.0 3.6 1.4 0.2 setosa NA
#> 6 5.4 3.9 1.7 0.4 setosa NA
# Then 'relabel' the points of interest
iris[2,]$label <- "Label_1"
iris[66,]$label <- "Label_2"
iris[144,]$label <- "Label_3"
head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species label
#> 1 5.1 3.5 1.4 0.2 setosa <NA>
#> 2 4.9 3.0 1.4 0.2 setosa Label_1
#> 3 4.7 3.2 1.3 0.2 setosa <NA>
#> 4 4.6 3.1 1.5 0.2 setosa <NA>
#> 5 5.0 3.6 1.4 0.2 setosa <NA>
#> 6 5.4 3.9 1.7 0.4 setosa <NA>
# Remove species column (5) and label column and scale the data
iris.pca <- PCA(iris[,-c(5,6)], graph = FALSE)
fviz_pca_ind(iris.pca,
             geom.ind = "point", # show points only (but not "text")
             col.ind = iris$Species, # color by groups
             palette = c("#00AFBB", "#E7B800", "#FC4E07"),
             addEllipses = TRUE, # Concentration ellipses
             legend.title = "Groups") +
  geom_text(aes(label = iris$label))
#> Warning: Removed 147 rows containing missing values (geom_text).
# You can nudge the labels left/right or up/down
fviz_pca_ind(iris.pca,
             geom.ind = "point", # show points only (but not "text")
             col.ind = iris$Species, # color by groups
             palette = c("#00AFBB", "#E7B800", "#FC4E07"),
             addEllipses = TRUE, # Concentration ellipses
             legend.title = "Groups") +
  geom_text(aes(label = iris$label), nudge_x = 0.5)
#> Warning: Removed 147 rows containing missing values (geom_text).
# Or you can use ggrepel
library(ggrepel)
fviz_pca_ind(iris.pca,
             geom.ind = "point", # show points only (but not "text")
             col.ind = iris$Species, # color by groups
             palette = c("#00AFBB", "#E7B800", "#FC4E07"),
             addEllipses = TRUE, # Concentration ellipses
             legend.title = "Groups") +
  geom_text_repel(aes(label = iris$label),
                  box.padding = 5)
#> Warning: Removed 147 rows containing missing values (geom_text_repel).
Created on 2022-07-06 by the reprex package (v2.0.1)
NB: The warning "#> Warning: Removed 147 rows containing missing values (geom_text)." relates to the NAs being removed (147 of the 150 rows have no label), and you can safely ignore it.
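If you would rather build the plot with plain ggplot2, one possible sketch is below. It is not part of the reprex above: it swaps FactoMineR's PCA() for base R's prcomp(), and the `id` column and `ids_to_label` vector are hypothetical stand-ins for your own IDs. Labels are drawn only for a subset of the scores, so no NA rows are created and the warning does not appear.

```r
library(ggplot2)

data("iris")

# Assumption: row numbers stand in for the IDs you want to find
iris$id <- seq_len(nrow(iris))
ids_to_label <- c(2, 66, 144)

# PCA on the four numeric columns, scaled to unit variance
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Scores for the first two components, plus grouping and ID columns
scores <- data.frame(pca$x[, 1:2], Species = iris$Species, id = iris$id)

# Keep only the rows that should receive a label
labelled <- scores[scores$id %in% ids_to_label, ]

p <- ggplot(scores, aes(PC1, PC2, colour = Species)) +
  geom_point() +
  geom_text(data = labelled, aes(label = id),
            nudge_y = 0.2, show.legend = FALSE)
p
```

If you keep the NA-label approach instead, passing na.rm = TRUE to geom_text() drops the missing labels silently.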