How autoplot (ggplot) gets scores and loadings from prcomp

I know there are lots of discussions out there on this subject, but every time I run into it I never find a consistent, satisfying answer.

I'm trying to create a very basic graphical depiction of a principal components analysis model. I always aim not to use packages that automatically generate plots for me, because I want the control.

Every time I try to make a PCA plot with loadings, I am stumped by how the canned functions relate their site-specific scores to the model's loading vectors. This is despite the myriad threads out there treating this matter, most of which just use the canned functions without explaining how the numbers got from a basic PCA model to the biplot.

For the example code below, I'll use autoplot. If I make a PCA model and use autoplot, I get a very cool graph. But I want to know how they get these numbers: the scores get rescaled, and I have no idea how the vectors are relativized the way they are on the plot. Can anyone walk me through how I would get these relativized numbers into data frames of my own (both scores and vectors), so I can make the aesthetic changes I want without using autoplot?

library(ggplot2)
library(ggfortify)  # provides the autoplot method for prcomp objects

d <- iris

m1 <- prcomp(d[, 1:4], scale = TRUE)

scores <- data.frame(m1$x[, 1:2])

# Scores range from about -2.5 to +3
ggplot(scores, aes(x = PC1, y = PC2)) +
  geom_point()

# Scores range from about -0.15 to 0.22, no clue where the relativized loadings come from
autoplot(m1, loadings = TRUE)

I'll attempt to walk you through a simplified version of the steps that autoplot uses to draw a PCA plot, so you can do this yourself quite easily in ggplot.

autoplot is actually an S3 generic function, so it's more accurate to talk about the method ggfortify:::autoplot.prcomp, since this is the function that is dispatched when you call autoplot on a prcomp object.

Let's start with your own example:

library(ggfortify)
library(ggplot2)

d <- iris

m1 <- prcomp(d[, 1:4], scale = TRUE)

scores <- data.frame(m1$x[, 1:2])

The scores are normalized by dividing each column by the square root of its sum of squared deviations from the column mean, so that each column ends up with unit length:

scores[] <- lapply(scores, function(x) x / sqrt(sum((x - mean(x))^2)))
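
As a quick sanity check on this step, here is a minimal base R sketch. Because prcomp centers the data by default, each score column has mean zero and a standard deviation equal to the matching entry of m1$sdev, so the divisor should equal sdev * sqrt(n - 1). The names raw_scores, divisor and norm_scores are illustrative only, and the model is rebuilt so the block runs on its own.

```r
# Self-contained check of the normalization step (base R only).
m1 <- prcomp(iris[, 1:4], scale = TRUE)
raw_scores <- m1$x[, 1:2]  # kept as a matrix for apply() and sweep()

# Divisor used in the answer: square root of the sum of squared deviations.
divisor <- apply(raw_scores, 2, function(x) sqrt(sum((x - mean(x))^2)))

# The same quantity expressed through the model's standard deviations.
n <- nrow(raw_scores)
all.equal(unname(divisor), m1$sdev[1:2] * sqrt(n - 1))

# After dividing, each column's sum of squares should be 1
# (up to floating point error).
norm_scores <- sweep(raw_scores, 2, divisor, "/")
colSums(norm_scores^2)
```

This also explains why the normalized scores are so much smaller than the raw ones: each column is shrunk by a factor that grows with both the component's spread and the number of rows.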

The loadings are simply obtained from the rotation member of the prcomp object:

loadings <- as.data.frame(m1$rotation)[1:2]

There is some internal scaling to ensure that the loadings appear on the same scale as the adjusted PC scores, but as far as I can tell this is simply for visual effect. The scaling amounts to about 0.2 here, and is calculated as follows:

scale <- min(max(abs(scores$PC1))/max(abs(loadings$PC1)),
             max(abs(scores$PC2))/max(abs(loadings$PC2))) * 0.8

scale
#> [1] 0.1987812
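
To see what that formula is doing, the sketch below splits it into its two per-axis ratios and checks the effect on the arrows. It assumes the same iris model as above; ratios, arrow_scale and scaled_loadings are illustrative names (arrow_scale is used instead of scale only to avoid reusing the name of the base function).

```r
# Self-contained look at the loading scale factor (base R only).
m1 <- prcomp(iris[, 1:4], scale = TRUE)
scores <- data.frame(m1$x[, 1:2])
scores[] <- lapply(scores, function(x) x / sqrt(sum((x - mean(x))^2)))
loadings <- as.data.frame(m1$rotation)[1:2]

# Largest absolute score divided by largest absolute loading, per axis.
ratios <- c(PC1 = max(abs(scores$PC1)) / max(abs(loadings$PC1)),
            PC2 = max(abs(scores$PC2)) / max(abs(loadings$PC2)))
ratios

# Taking the smaller ratio keeps every arrow component within the range
# of the scores on both axes; the factor 0.8 leaves a 20% margin.
arrow_scale <- min(ratios) * 0.8
scaled_loadings <- loadings * arrow_scale

# On each axis, the longest arrow component reaches at most 80% of the
# largest absolute score, and exactly 80% on the limiting axis.
reach <- c(PC1 = max(abs(scaled_loadings$PC1)) / max(abs(scores$PC1)),
           PC2 = max(abs(scaled_loadings$PC2)) / max(abs(scores$PC2)))
reach
```

Since the factor is purely cosmetic, you can swap 0.8 for any value you like without changing what the arrows mean: their directions and relative lengths are untouched.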

We now have enough to recreate the autoplot using vanilla ggplot code.

ggplot(scores, aes(x = PC1, y = PC2)) +
  geom_point() +
  geom_segment(data = loadings * scale,
               aes(x = 0, y = 0, xend = PC1, yend = PC2),
               color = "red", arrow = arrow(angle = 25, length = unit(4, "mm")))

Aside from the axis titles, this is identical to the autoplot:

autoplot(m1, loadings = TRUE)
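
With the scores and loadings in your own data frames, the aesthetic changes are ordinary ggplot work. The sketch below is one possible customization, not a reproduction of what autoplot does: it colors points by species, labels the arrows, and writes axis titles with the share of variance explained, computed as sdev^2 / sum(sdev^2). The names arrow_scale, arrows_df and var_pct, and all colors, sizes and label positions, are illustrative choices.

```r
library(ggplot2)

m1 <- prcomp(iris[, 1:4], scale = TRUE)

# Normalized scores, plus the species column for coloring.
scores <- data.frame(m1$x[, 1:2])
scores[] <- lapply(scores, function(x) x / sqrt(sum((x - mean(x))^2)))
scores$Species <- iris$Species

# Scaled loadings, with the variable names as a column for labeling.
loadings <- as.data.frame(m1$rotation)[1:2]
arrow_scale <- min(max(abs(scores$PC1)) / max(abs(loadings$PC1)),
                   max(abs(scores$PC2)) / max(abs(loadings$PC2))) * 0.8
arrows_df <- loadings * arrow_scale
arrows_df$variable <- rownames(loadings)

# Share of total variance explained by each component, in percent.
var_pct <- round(100 * m1$sdev^2 / sum(m1$sdev^2), 1)

p <- ggplot(scores, aes(x = PC1, y = PC2)) +
  geom_point(aes(color = Species)) +
  geom_segment(data = arrows_df,
               aes(x = 0, y = 0, xend = PC1, yend = PC2),
               color = "grey20",
               arrow = arrow(angle = 25, length = unit(2, "mm"))) +
  geom_text(data = arrows_df,
            aes(x = PC1, y = PC2, label = variable),
            vjust = -0.5, size = 3) +
  labs(x = paste0("PC1 (", var_pct[1], "%)"),
       y = paste0("PC2 (", var_pct[2], "%)"))

p
```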
