
Multiple correlation coefficient in R

I am looking for a way to calculate the multiple correlation coefficient in R (http://en.wikipedia.org/wiki/Multiple_correlation). Is there a built-in function to calculate it? I have one dependent variable and three independent ones. I have not been able to find anything online. Any ideas?

The built-in function lm gives at least one version of it, though I'm not sure whether this is what you are looking for:

fit <- lm(yield ~ N + P + K, data = npk)
summary(fit)

Gives:

Call:
lm(formula = yield ~ N + P + K, data = npk)

Residuals:
    Min      1Q  Median      3Q     Max 
-9.2667 -3.6542  0.7083  3.4792  9.3333 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   54.650      2.205  24.784   <2e-16 ***
N1             5.617      2.205   2.547   0.0192 *  
P1            -1.183      2.205  -0.537   0.5974    
K1            -3.983      2.205  -1.806   0.0859 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.401 on 20 degrees of freedom
Multiple R-squared:  0.3342,    Adjusted R-squared:  0.2343 
F-statistic: 3.346 on 3 and 20 DF,  p-value: 0.0397

More information on what is going on is available at ?summary.lm and ?lm.

Try this:

# load sample data
data(mtcars)

# calculate the correlation coefficients between all pairs of variables in
# `mtcars` using the built-in function cor()

M <- cor(mtcars)

# M is a matrix of pairwise correlation coefficients, which you can display
# just by running

print(M)

# If you want to plot the correlation coefficients
# (requires the corrplot package to be installed)

library(corrplot)
corrplot(M, method = "number", type = "lower", insig = "blank", number.cex = 0.6)
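
Note that M holds pairwise correlations only, which is not the same thing as the multiple correlation coefficient. The latter can be derived from M, though, via the standard identity R^2 = c' Rxx^-1 c, where c is the vector of correlations between each independent variable and the dependent one, and Rxx is the correlation matrix of the independent variables. A minimal sketch follows. The choice of mpg as the dependent variable and wt and cyl as the independent ones is an assumption made purely for illustration.

```r
# Pairwise correlation matrix of the built-in mtcars dataset
M <- cor(mtcars)

# Illustrative choice of variables (an assumption, not taken from the question)
response   <- "mpg"
predictors <- c("wt", "cyl")

# c_vec: correlation of each predictor with the response
# Rxx:   correlations among the predictors themselves
c_vec <- M[predictors, response]
Rxx   <- M[predictors, predictors]

# R^2 = c' Rxx^{-1} c; solve(Rxx, c_vec) avoids forming the inverse explicitly
R2 <- sum(c_vec * solve(Rxx, c_vec))
multiple_R <- sqrt(R2)
print(multiple_R)
```

For a linear model with an intercept, this should agree with the square root of the Multiple R-squared that summary() reports for lm(mpg ~ wt + cyl, data = mtcars).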

The easiest way to calculate the multiple correlation coefficient (i.e. the correlation between two or more variables on the one hand, and one variable on the other) is to fit a multiple linear regression (predicting the values of one variable, treated as dependent, from the values of two or more variables, treated as independent) and then calculate the coefficient of correlation between the predicted and observed values of the dependent variable.

Here, for example, we create a linear model called mpg.model, with mpg as the dependent variable and wt and cyl as the independent variables, using the built-in mtcars dataset:

> mpg.model <- lm(mpg ~ wt + cyl, data = mtcars)

Having created the above model, we correlate the observed values of mpg (which are stored in the fitted object, in its model data frame) with the predicted values for the same variable (stored in the same object as fitted.values):

> cor(mpg.model$model$mpg, mpg.model$fitted.values)
[1] 0.9111681

R will in fact do this calculation for you, without saying so explicitly, when you ask it for the summary of a model (as in Brian's answer): the summary of an lm object contains R-squared, which is the square of the multiple correlation coefficient. So an alternative way to get the same result is to extract R-squared from the summary.lm object and take its square root, thus:

> sqrt(summary(mpg.model)$r.squared)
[1] 0.9111681
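
For the case in the question (one dependent variable and three independent ones), both routes can be wrapped up and cross-checked. The sketch below uses the built-in npk dataset from the earlier answer. The helper name multiple_r is hypothetical, not a built-in function, and the agreement between the two routes assumes the model includes an intercept, as lm does by default.

```r
# Hypothetical helper (not part of base R): multiple correlation coefficient
# of a fitted lm object, computed as cor(observed, fitted)
multiple_r <- function(fit) {
  observed <- model.response(model.frame(fit))
  cor(observed, fitted(fit))
}

# Three independent variables, using the built-in npk dataset
fit3 <- lm(yield ~ N + P + K, data = npk)

r_from_cor     <- multiple_r(fit3)
r_from_summary <- sqrt(summary(fit3)$r.squared)

# The two routes agree for a model that includes an intercept
print(c(r_from_cor, r_from_summary))
print(isTRUE(all.equal(r_from_cor, r_from_summary)))
```

If the intercept is dropped from the formula (for example with "- 1"), summary.lm computes R-squared differently, so the two values should not be expected to match in that case.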
