glm() results for logistic regression

Question

This might be a trivial question, but I don't know where to find the answer. When using glm() for logistic regression in R, if the response variable Y is a factor with values 1 or 2, does the result of glm() correspond to logit(P(Y=1)) or logit(P(Y=2))? What if Y has logical values TRUE or FALSE?

Answer 1

Why not just test it yourself?

output_bool <- c(rep(c(TRUE, FALSE), c(25, 75)), rep(c(TRUE, FALSE), c(75, 25)))
output_num <- c(rep(c(2, 1), c(25, 75)), rep(c(2, 1), c(75, 25)))
output_fact <- factor(output_num)
var <- rep(c("unlikely", "likely"), each = 100)

glm(output_bool ~ var, binomial)
#> 
#> Call:  glm(formula = output_bool ~ var, family = binomial)
#> 
#> Coefficients:
#> (Intercept)  varunlikely  
#>       1.099       -2.197  
#> 
#> Degrees of Freedom: 199 Total (i.e. Null);  198 Residual
#> Null Deviance:       277.3 
#> Residual Deviance: 224.9     AIC: 228.9
glm(output_num ~ var, binomial)
#> Error in eval(family$initialize): y values must be 0 <= y <= 1
glm(output_fact ~ var, binomial)
#> 
#> Call:  glm(formula = output_fact ~ var, family = binomial)
#> 
#> Coefficients:
#> (Intercept)  varunlikely  
#>       1.099       -2.197  
#> 
#> Degrees of Freedom: 199 Total (i.e. Null);  198 Residual
#> Null Deviance:       277.3 
#> Residual Deviance: 224.9     AIC: 228.9

So we get the correct answer if we use TRUE and FALSE, an error if we use 1 and 2 as numbers, and the correct result if we use 1 and 2 as a factor with two levels, provided the level standing for TRUE comes after the level standing for FALSE. However, we have to be careful about how the factor levels are ordered, or we will get the wrong result:

output_fact <- factor(output_fact, levels = c("2", "1"))
glm(output_fact ~ var, binomial)
#> 
#> Call:  glm(formula = output_fact ~ var, family = binomial)
#> 
#> Coefficients:
#> (Intercept)  varunlikely  
#>      -1.099        2.197  
#> 
#> Degrees of Freedom: 199 Total (i.e. Null);  198 Residual
#> Null Deviance:       277.3 
#> Residual Deviance: 224.9     AIC: 228.9

(Notice that the intercept and the coefficient have flipped signs.)

Created on 2020-06-21 by the reprex package (v0.3.0)
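To avoid depending on the default level order, one option is to inspect the levels and set the reference ("failure") level explicitly. The following is a minimal sketch that reuses the data from the example above; the names y, y01, fit_fact and fit_num are introduced here for illustration only. It also shows that recoding the numeric 1/2 response to 0/1 avoids the error.

```r
# Rebuild the example data used above
output_num  <- c(rep(c(2, 1), c(25, 75)), rep(c(2, 1), c(75, 25)))
output_fact <- factor(output_num)
var <- rep(c("unlikely", "likely"), each = 100)

# The first level is treated as "failure"; glm() models the other level
levels(output_fact)
#> [1] "1" "2"

# Make the choice explicit instead of relying on the default ordering
y <- relevel(output_fact, ref = "1")  # "1" = failure, so "2" = success
fit_fact <- glm(y ~ var, family = binomial)

# A numeric 1/2 response can be recoded to 0/1, which glm() accepts
y01 <- output_num - 1
fit_num <- glm(y01 ~ var, family = binomial)

# Both fits should describe logit(P(Y = 2))
all.equal(unname(coef(fit_fact)), unname(coef(fit_num)))
```

With relevel() the modeled outcome is stated in the code, so a later change in how the factor is built does not silently flip the signs.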

Answer 2

Testing is good. If you want the documentation, it's in ?binomial (which is the same as ?family):

For the 'binomial' and 'quasibinomial' families the response can be specified in one of three ways:

As a factor: 'success' is interpreted as the factor not having the first level (and hence usually of having the second level).

As a numerical vector with values between '0' and '1', interpreted as the proportion of successful cases (with the total number of cases given by the 'weights').

As a two-column integer matrix: the first column gives the number of successes and the second the number of failures.
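As an illustration of the three forms quoted above, here is a hedged sketch that fits the same 25/100 versus 75/100 data each way. The data frame agg and the names y_fact, var_long, fit_factor, fit_prop and fit_matrix are made up for this example and are not part of the documentation.

```r
# Aggregated data: 25/100 successes when "unlikely", 75/100 when "likely"
agg <- data.frame(
  var       = c("unlikely", "likely"),
  successes = c(25, 75),
  failures  = c(75, 25)
)
agg$total <- agg$successes + agg$failures

# 1. Factor response on individual-level data ("no" is the first level)
y_fact <- factor(
  c(rep(c("yes", "no"), c(25, 75)), rep(c("yes", "no"), c(75, 25))),
  levels = c("no", "yes")
)
var_long <- rep(c("unlikely", "likely"), each = 100)
fit_factor <- glm(y_fact ~ var_long, family = binomial)

# 2. Proportion of successes, with the number of cases given as weights
fit_prop <- glm(successes / total ~ var, family = binomial,
                weights = total, data = agg)

# 3. Two-column matrix: successes first, failures second
fit_matrix <- glm(cbind(successes, failures) ~ var,
                  family = binomial, data = agg)

# The coefficient estimates should agree across the three specifications
rbind(
  factor     = unname(coef(fit_factor)),
  proportion = unname(coef(fit_prop)),
  matrix     = unname(coef(fit_matrix))
)
```

The deviances and degrees of freedom differ between the individual-level and the aggregated fits, but the coefficients describe the same model.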

It doesn't explicitly say what happens in the logical (TRUE / FALSE) case; for that you have to know that, when coercing logical to numeric values, FALSE → 0 and TRUE → 1.
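The coercion can be checked directly. This is a small sketch reusing the data from the first answer; fit_bool and fit_01 are illustrative names. It compares a logical response with the same response converted to 0/1 by hand.

```r
# Logical values coerce to 0 (FALSE) and 1 (TRUE)
as.numeric(c(FALSE, TRUE))
#> [1] 0 1

output_bool <- c(rep(c(TRUE, FALSE), c(25, 75)), rep(c(TRUE, FALSE), c(75, 25)))
var <- rep(c("unlikely", "likely"), each = 100)

# Logical response versus an explicit 0/1 response
fit_bool <- glm(output_bool ~ var, family = binomial)
fit_01   <- glm(as.numeric(output_bool) ~ var, family = binomial)

# Both should model logit(P(Y = TRUE))
all.equal(coef(fit_bool), coef(fit_01))
```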
