Least-squares fit of PCA scores on original variables
I have 100 variables, and I want to run a factor analysis on variables var15 to var25. To do that, I first extracted those variables into another object (say f) and then ran a principal component analysis.
Now I want to merge the PCA scores with the original dataset so that I can run a regression using the PCA scores as predictors.
Can anybody suggest a method to merge these two datasets? The code I used is the following:
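library(sqldf)
# Pull the respondent ID and the ten Q4 items out of the full dataset
spss_data_factor <- sqldf("SELECT Respondent_Serial,Q4_01_Q4,Q4_02_Q4,Q4_03_Q4,Q4_04_Q4,Q4_05_Q4,Q4_06_Q4,Q4_07_Q4,Q4_08_Q4,Q4_09_Q4,Q4_10_Q4 FROM spss_data_rel")
# PCA on the correlation matrix of the Q4 items; leaving out the ID column is an assumed intent
f <- princomp(spss_data_factor[, -1], cor = TRUE)
summary(f, loadings = TRUE)
f$scores[, 1:5]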
Please avoid naming your objects after functions from base R packages: factor is the name of a base R function. It is not strictly reserved, so your code will work just fine, but it may confuse you at some point during development. Also, your factor is not a file; it is an R object of class princomp.
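As a small illustration of both points, here is a sketch that uses the built-in mtcars data as a stand-in for the real dataset. The name pca_fit is just a suggestion, not anything from the question.

```r
# Assumption: mtcars stands in for the asker's data.
pca_fit <- princomp(mtcars, cor = TRUE)

class(pca_fit)          # "princomp"
is.matrix(pca_fit$scores)
dim(pca_fit$scores)     # one row per observation, one column per component

# Masking demo: an object called "factor" does not break calls to factor(),
# because R skips non-function objects when it looks up a function call.
factor <- pca_fit
levels(factor(c("a", "b", "a")))
rm(factor)              # remove the confusing name again
```

The scores are an ordinary numeric matrix in the same row order as the input, which is why they can be used directly in a model formula.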
Anyway, you want to define a regression model with factor scores as predictors? Piece of cake, and no merging is required:
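# PCA on the correlation matrix of all mtcars columns
fa <- princomp(mtcars, cor=TRUE)
# Matrix of component scores, one row per observation
fa_scores <- fa$scores
# Regress hp on all component scores
fit <- lm(mtcars$hp ~ fa_scores)
summary(fit)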
Call:
lm(formula = mtcars$hp ~ fa_scores)

Residuals:
       Min         1Q     Median         3Q        Max
-2.521e-14 -7.825e-15 -2.416e-15  5.622e-15  4.329e-14

Coefficients:
                   Estimate Std. Error    t value Pr(>|t|)
(Intercept)       1.467e+02  2.862e-15  5.125e+16   <2e-16 ***
fa_scoresComp.1  -2.227e+01  1.113e-15 -2.000e+16   <2e-16 ***
fa_scoresComp.2  -1.679e+01  1.758e-15 -9.549e+15   <2e-16 ***
fa_scoresComp.3   9.449e+00  3.614e-15  2.614e+15   <2e-16 ***
fa_scoresComp.4  -4.567e+00  5.513e-15 -8.285e+14   <2e-16 ***
fa_scoresComp.5  -3.644e+01  6.055e-15 -6.019e+15   <2e-16 ***
fa_scoresComp.6  -4.821e+00  6.222e-15 -7.747e+14   <2e-16 ***
fa_scoresComp.7  -1.010e-01  7.783e-15 -1.297e+13   <2e-16 ***
fa_scoresComp.8   1.501e+01  8.164e-15  1.838e+15   <2e-16 ***
fa_scoresComp.9  -3.886e+01  1.031e-14 -3.768e+15   <2e-16 ***
fa_scoresComp.10  1.672e+01  1.255e-14  1.333e+15   <2e-16 ***
fa_scoresComp.11 -1.731e+01  1.928e-14 -8.979e+14   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.619e-14 on 20 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:      1
F-statistic: 5.053e+31 on 11 and 20 DF,  p-value: < 2.2e-16
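Note that the R-squared of 1 here is expected: hp is itself one of the columns that went into the PCA, and all 11 components together reproduce the original variables exactly. For a real analysis you would probably compute the components from the predictor columns only. A minimal sketch follows; the choice of predictor columns, the use of two components, and the names pred_cols, merged and fit_pcr are illustrative assumptions.

```r
# Assumption: mtcars stands in for the real data, hp is the response,
# and the predictor columns below are picked only for illustration.
pred_cols <- c("mpg", "disp", "drat", "wt", "qsec")
pca <- princomp(mtcars[, pred_cols], cor = TRUE)

# Scores are returned in the same row order as the input,
# so cbind() is enough to attach them to the original data frame.
scores <- as.data.frame(pca$scores[, 1:2])
merged <- cbind(mtcars, scores)

# Regression on the first two components, using the merged data frame
fit_pcr <- lm(hp ~ Comp.1 + Comp.2, data = merged)
summary(fit_pcr)
```

If rows were dropped before the PCA (for example because of missing values), row order no longer lines up, and it is probably safer to keep an ID column next to the scores and use merge() on that key instead of cbind().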
You may also want to convert the original dataset to a matrix in order to carry out ncol(mtcars) regressions, one for each column of the response matrix. The lm function supports a response ~ terms formula in which response can be a matrix. See ?lm:

"If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix."

So, you can do something like this:
fit2 <- lm(as.matrix(mtcars) ~ fa_scores)
summary(fit2) # handle with care! =)
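To make the "handle with care" part a bit more concrete, here is a smaller sketch of the same idea with three response columns and three components. The column choice and the names Y, scores and fit_multi are assumptions for illustration.

```r
# Assumption: mtcars stands in for the real data.
fa <- princomp(mtcars, cor = TRUE)
scores <- fa$scores[, 1:3]                      # first three components only
Y <- as.matrix(mtcars[, c("mpg", "hp", "wt")])  # three response columns

# One least-squares fit per column of Y, all with the same predictors
fit_multi <- lm(Y ~ scores)

class(fit_multi)       # "mlm" "lm"
dim(coef(fit_multi))   # rows: intercept + 3 scores; columns: one per response
coef(fit_multi)[, "hp"]
```

The coefficients come back as a matrix with one column per response, and summary() on such a fit prints one regression summary per response, which gets long quickly when the response matrix has many columns.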
I hope that this was helpful. Anyway, if you want to perform a factor analysis, please see this link. You should install William Revelle's psych package.
Thank you, aL3xa! I found the solution. I'm putting it here as somebody might find it helpful.
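## Factor analysis
library(psych)  # provides fa.parallel()
spss_data_fac <- read.csv("D:\\Arijit\\spss_data_rel_01.csv")
# Parallel analysis to help choose the number of factors
fa.parallel(spss_data_fac[, 40:49])
# Two-factor model with promax rotation and regression-method scores
spss_data_fac_01 <- factanal(spss_data_fac[, 40:49], factors = 2, scores = "regression", rotation = "promax")
spss_data_fac_01$scores
## Factor scores are used as predictors in a logistic regression
## (family = binomial assumes Q8 is a binary response; glm() defaults to gaussian)
spss_dat_reg <- glm(spss_data_fac$Q8 ~ spss_data_fac_01$scores + spss_data_fac$Q14, family = binomial)
summary(spss_dat_reg)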
Regards, A
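For anyone without access to the CSV file, the same workflow can be sketched on simulated data. Everything in this block is hypothetical: the item names q1 to q6, the two latent factors, and the binary outcome y are made up so that the example runs on its own, and it uses only factanal() and glm() from the stats package.

```r
set.seed(1)
n <- 200

# Two hypothetical latent factors, each driving three observed items
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- data.frame(
  q1 = f1 + rnorm(n, sd = 0.5),
  q2 = f1 + rnorm(n, sd = 0.5),
  q3 = f1 + rnorm(n, sd = 0.5),
  q4 = f2 + rnorm(n, sd = 0.5),
  q5 = f2 + rnorm(n, sd = 0.5),
  q6 = f2 + rnorm(n, sd = 0.5)
)

# Hypothetical binary outcome, loosely related to the first factor
y <- rbinom(n, 1, plogis(0.8 * f1))

# Two-factor model with regression-method scores and promax rotation
fa_fit <- factanal(items, factors = 2, scores = "regression", rotation = "promax")

# Attach the scores (columns Factor1, Factor2) to the data; rows stay in order
dat <- cbind(items, y = y, fa_fit$scores)

# Logistic regression needs family = binomial; glm() defaults to gaussian
logit_fit <- glm(y ~ Factor1 + Factor2, data = dat, family = binomial)
summary(logit_fit)
```

Putting the scores into a data frame and passing it through the data argument keeps the coefficient names short and makes predict() on new data easier than referring to columns with $ inside the formula.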