
Inconsistent results between glm() in R and manual implementation of logistic regression in Excel

You'll find a manual implementation of logistic regression in Excel at: http://blog.excelmasterseries.com/2014/06/logistic-regression-performed-in-excel.html

This implementation uses the dataset below and reports the following coefficients:

b0 = 12.48285608

b1 = -0.117031374

b2 = -1.469140055

However, when I analyze the same dataset with glm() in R, the results are not the same:

b0 = 1.687445

b1 = -0.012525

b2 = -0.116473

d <- structure(list(Y = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), X1 = c(78L, 73L, 73L, 
71L, 68L, 59L, 57L, 49L, 35L, 27L, 59L, 57L, 44L, 38L, 36L, 36L, 
22L, 22L, 15L, 10L), X2 = c(8L, 8L, 5L, 7L, 5L, 4L, 7L, 5L, 4L, 
7L, 3L, 4L, 5L, 5L, 4L, 2L, 6L, 5L, 4L, 6L)), .Names = c("Y", 
"X1", "X2"), class = "data.frame", row.names = c(NA, -20L))  

summary(glm(Y ~ X1+X2, data=d), family=binomial(link='logit'))


# > summary(glm(Y ~ X1+X2, data=d), family=binomial(link='logit'))
# 
# Call:
#   glm(formula = Y ~ X1 + X2, data = d)
# 
# Deviance Residuals: 
#   Min        1Q    Median        3Q       Max  
# -0.78318  -0.20641   0.07689   0.24375   0.49237  
# 
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)    
# (Intercept)  1.687445   0.319872   5.275 6.18e-05 ***
#   X1          -0.012525   0.004376  -2.862   0.0108 *  
#   X2          -0.116473   0.056959  -2.045   0.0567 .  
# ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# (Dispersion parameter for gaussian family taken to be 0.146843)
# 
# Null deviance: 5.0000  on 19  degrees of freedom
# Residual deviance: 2.4963  on 17  degrees of freedom
# AIC: 23.139
# 
# Number of Fisher Scoring iterations: 2

Why do the results differ?

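You have the family argument in the wrong place. It should be in the glm() call, not the summary() call. summary() silently ignores the extra argument, so no error is raised, and the Call: line in your output, glm(formula = Y ~ X1 + X2, data = d), confirms that the model was fitted without a family.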

summary(glm(Y ~ X1+X2, data=d, family=binomial(link='logit')))

If you don't specify the family in glm(), it defaults to gaussian and fits an ordinary linear regression. That is why your output reports a "Dispersion parameter for gaussian family" and t values rather than z values.
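To see the difference side by side, here is a minimal sketch that fits both models on the dataset from the question and compares them. The object names fit_default and fit_logistic are only illustrative, and the data frame is rebuilt with data.frame() rather than the original structure() call. The logistic coefficients should come out close to the Excel values if the spreadsheet fit converged, but that is not verified here.

```r
# Dataset from the question, rebuilt with data.frame() for readability
d <- data.frame(
  Y  = c(rep(0L, 10), rep(1L, 10)),
  X1 = c(78, 73, 73, 71, 68, 59, 57, 49, 35, 27,
         59, 57, 44, 38, 36, 36, 22, 22, 15, 10),
  X2 = c(8, 8, 5, 7, 5, 4, 7, 5, 4, 7,
         3, 4, 5, 5, 4, 2, 6, 5, 4, 6)
)

# No family given: glm() falls back to gaussian, i.e. a linear regression
fit_default <- glm(Y ~ X1 + X2, data = d)

# Family passed to glm() itself: this is the logistic regression
fit_logistic <- glm(Y ~ X1 + X2, data = d,
                    family = binomial(link = "logit"))

# Check which family each fit actually used
family(fit_default)$family
family(fit_logistic)$family

# The default fit should match an ordinary least-squares fit from lm()
all.equal(coef(fit_default), coef(lm(Y ~ X1 + X2, data = d)))

# Coefficients of both fits next to each other
cbind(gaussian = coef(fit_default), logistic = coef(fit_logistic))
```

Calling family() on a fitted model is a quick way to confirm which model was estimated before comparing coefficients against another tool.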
