
How to highlight a particular variable or individual in a PCA space in R

I am currently working on a large dataset (count data, species x samples) on which I performed a PCA. What I get is a massive cloud of points, and I would like to color one given species to show where it is located in this cloud (species are my variables here). Here is what it looks like:

[Image: the PCA plot described above]

I use the factoextra package and visualize the variables with fviz_pca_var. Is there a way to select one particular species and display it in a different color from the others?

Thank you for your help.

I would not label every data point. Just use a legend and highlight your species in, e.g., red, with all other species in green.

You did not provide example data, so here is a solution using other sample data. Using factoextra (and FactoMineR), run the PCA on all numerical columns. Then, when plotting the first two dimensions of the PCA, add a factor variable that flags your species. Create that factor with a simple ifelse() call that separates your species from the rest, and pass it to fviz_pca_ind for highlighting. See the code below for an example:

library(FactoMineR)
library(ggplot2)
library(factoextra)

data("iris")
iris2 <- iris[1:4]
head(iris2)

# PCA analysis to get PCs
iris.pca <- PCA(iris2, scale.unit = TRUE, graph = FALSE)

# use Species from iris to change habillage
fviz_pca_ind(iris.pca, label="none", habillage = iris$Species)


# Flag the species of interest; everything else becomes "other_species"
iris$new_species <- as.factor(ifelse(iris$Species == "virginica",
                                     "my_species", "other_species"))

# Highlight only the flagged species: red for it, green for the rest
fviz_pca_ind(iris.pca, label = "none", habillage = iris$new_species,
             palette = c("red", "green"))

[Image: resulting PCA plot]
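The question asks about variables (fviz_pca_var) rather than individuals, so here is a minimal sketch of the same idea applied to variables. It assumes that passing a factor to col.var groups the variable arrows by color, as it does in the factoextra examples. The iris measurements stand in for the species columns, and target is a hypothetical name to replace with your own species.

```r
library(FactoMineR)
library(factoextra)

data("iris")
iris.pca <- PCA(iris[, 1:4], scale.unit = TRUE, graph = FALSE)

# Variable names in the order the PCA stores them
var_names <- rownames(iris.pca$var$coord)

# Hypothetical variable of interest; replace with your species name
target <- "Petal.Width"

# Two-level factor: the target versus everything else
var_group <- factor(ifelse(var_names == target, "highlighted", "other"),
                    levels = c("highlighted", "other"))

# Color the arrows by group: red for the target, grey for the rest
p_var <- fviz_pca_var(iris.pca,
                      col.var = var_group,
                      palette = c("red", "grey60"),
                      legend.title = "Variable")
p_var
```

With many variables the labels will overlap, so it may help to show only the target's label, for example by plotting with geom = "arrow" and adding the single label separately.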

If it's just a single point you want to color, perhaps:

library(tidyverse)
library(factoextra)
library(FactoMineR)

data("iris")

iris$assigned_colors <- NA
# Change the color of the 'individual of interest'
iris[9,]$assigned_colors <- "red"

iris.pca <- PCA(iris[,-c(5,6)], graph = FALSE)

fviz_pca_ind(iris.pca,
             geom = "point",
             geom.ind = "point") +
  geom_point(aes(color = iris$assigned_colors)) +
  scale_color_identity()
#> Warning: Removed 149 rows containing missing values (geom_point).

Created on 2022-07-08 by the reprex package (v2.0.1)

You can also label specific points (i.e. just the point of interest) using this approach, e.g.

library(tidyverse)
library(factoextra)
library(FactoMineR)

data("iris")

iris$assigned_colors <- NA
iris[9,]$assigned_colors <- "red"

iris$labels <- NA
iris[9,]$labels <- "point of interest"

iris.pca <- PCA(iris[,-c(5,6, 7)], graph = FALSE)

fviz_pca_ind(iris.pca,
             geom = "point",
             geom.ind = "point") +
  geom_point(aes(color = iris$assigned_colors)) +
  geom_text(aes(label = iris$labels), nudge_y = -0.2) +
  scale_color_identity()
#> Warning: Removed 149 rows containing missing values (geom_point).
#> Warning: Removed 149 rows containing missing values (geom_text).

Created on 2022-07-08 by the reprex package (v2.0.1)
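The "Removed 149 rows" warnings come from the NA entries in the color and label columns. A possible variant, sketched here under the assumption that the individual coordinates are available in iris.pca$ind$coord, is to build a one-row data frame for the point of interest and draw it as its own layer. The names idx and poi are illustrative only.

```r
library(FactoMineR)
library(factoextra)
library(ggplot2)

data("iris")
iris.pca <- PCA(iris[, 1:4], graph = FALSE)

# Row index of the individual of interest (row 9, as in the example above)
idx <- 9

# Coordinates of that individual on the first two principal components
poi <- data.frame(
  x = iris.pca$ind$coord[idx, 1],
  y = iris.pca$ind$coord[idx, 2],
  label = "point of interest"
)

# Draw all individuals in grey, then overlay the single point and its label
p_ind <- fviz_pca_ind(iris.pca, geom = "point", col.ind = "grey50") +
  geom_point(data = poi, aes(x = x, y = y),
             color = "red", size = 3, inherit.aes = FALSE) +
  geom_text(data = poi, aes(x = x, y = y, label = label),
            nudge_y = -0.2, inherit.aes = FALSE)
p_ind
```

Because the extra layers only ever see the one-row data frame, nothing has to be dropped as missing, and the same pattern extends to several points by adding rows to poi.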
