GLM function for Logistic Regression: what is the default predicted outcome?

I am relatively new to modelling in R and came across the glm function. I am interested in logistic regression using the 'binomial' family. When my dependent variable can take one of two possible outcomes, say 'positive' and 'negative', which outcome are the estimates computed for? Does the model predict the log odds of a 'positive' or of a 'negative' outcome by default? Also, which outcome is considered the default for estimation when the dependent variable is

  1. Yes or No
  2. 1 or 2
  3. Pass or Fail

etc.?

Is there a rule by which R selects this default? Is there a way to override it manually? Please clarify.

It's in the details of ?binomial:

For the 'binomial' and 'quasibinomial' families the response can be specified in one of three ways:

  1. As a factor: 'success' is interpreted as the factor not having the first level (and hence usually of having the second level). Added note: the first level is usually the first one alphabetically, since this is how R orders factor levels by default.

  2. As a numerical vector with values between '0' and '1', interpreted as the proportion of successful cases (with the total number of cases given by the 'weights').

  3. As a two-column integer matrix: the first column gives the number of successes and the second the number of failures.
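The three forms quoted above can be compared on a small example. This is only a sketch: the data frame names (agg, long), the dose variable and the counts are made up for illustration and are not part of the question. If the documentation is read correctly, all three fits should give the same coefficients.

```r
# Hypothetical grouped data: counts are invented for illustration only.
agg <- data.frame(dose      = c(1, 2, 3),
                  successes = c(5, 10, 15),
                  failures  = c(15, 10, 5))
agg$n <- agg$successes + agg$failures

# Expand to one row per case, with a two-level factor response.
long <- data.frame(
  dose    = rep(agg$dose, times = agg$n),
  outcome = factor(unlist(Map(
    function(s, f) rep(c("positive", "negative"), times = c(s, f)),
    agg$successes, agg$failures
  )))
)
levels(long$outcome)  # "negative" "positive": "negative" is the first level

# 1. Factor response: anything that is not the first level counts as success.
fit_factor <- glm(outcome ~ dose, family = binomial, data = long)

# 2. Proportion of successes, with the number of cases passed as weights.
fit_prop <- glm(successes / n ~ dose, family = binomial,
                weights = n, data = agg)

# 3. Two-column matrix: successes first, failures second.
fit_matrix <- glm(cbind(successes, failures) ~ dose,
                  family = binomial, data = agg)

# The three specifications describe the same likelihood up to a constant.
all.equal(coef(fit_factor), coef(fit_prop),   tolerance = 1e-4)
all.equal(coef(fit_factor), coef(fit_matrix), tolerance = 1e-4)
```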

So the predicted probability is the probability of "success", i.e. of the second level of the factor, or of a 1 in the numeric case.
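One way to check which outcome is being modelled is to fit an intercept-only model and compare the fitted probability with the observed share of each level. The vector below is a made-up example, not data from the question.

```r
# Hypothetical response: 5 "positive" and 3 "negative" observations.
outcome <- factor(c("positive", "negative", "positive", "positive",
                    "negative", "positive", "negative", "positive"))

levels(outcome)  # "negative" "positive": the first level is the failure

# Intercept-only logistic regression: the intercept is the log odds of success.
fit <- glm(outcome ~ 1, family = binomial)
p_hat <- unname(plogis(coef(fit)))

# The fitted probability matches the share of the SECOND level ("positive"),
# not the share of the first level.
p_hat
mean(outcome == levels(outcome)[2])
```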

From your examples:

  • Yes or No: by default "No" is treated as the failure (it comes first alphabetically), so the model predicts the probability of "Yes". To flip this, use my_data$my_factor <- relevel(my_data$my_factor,"Yes") to make "Yes" the first level; "Yes" is then the failure and the model predicts the probability of "No".
  • 1 or 2: this will either fail or produce bogus results. Either make the variable into a factor ("1" will be treated as the first level, so 2 is the success) or subtract 1 to get a 0/1 variable (or use 2-x if you want 2 to be treated as a failure).
  • Pass or Fail: see "Yes or No". "Fail" comes first alphabetically, so it is the failure and the model predicts the probability of "Pass".
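The overrides described in the list can be sketched as follows. The data frame my_data and the column my_factor follow the names used above; the column x12 and the values themselves are hypothetical. Intercept-only models are used so that the fitted probability can be compared directly with the observed shares.

```r
# Hypothetical data: 4 "Yes" and 2 "No", plus the same outcome coded as 2/1.
my_data <- data.frame(
  my_factor = factor(c("Yes", "No", "Yes", "Yes", "No", "Yes")),
  x12       = c(2, 1, 2, 2, 1, 2)
)

# Default: levels are sorted, so "No" is the first level (failure).
levels(my_data$my_factor)  # "No" "Yes"
p_default <- unname(plogis(coef(
  glm(my_factor ~ 1, family = binomial, data = my_data))))

# After relevel(), "Yes" is the first level, so the model predicts "No".
my_data$my_factor <- relevel(my_data$my_factor, "Yes")
levels(my_data$my_factor)  # "Yes" "No"
p_releveled <- unname(plogis(coef(
  glm(my_factor ~ 1, family = binomial, data = my_data))))

# A 1/2 coding has to be recoded before it is used as a binomial response.
my_data$y01 <- my_data$x12 - 1       # 2 becomes 1 (success), 1 becomes 0
my_data$y10 <- 2 - my_data$x12       # 1 becomes 1 (success), 2 becomes 0
my_data$f12 <- factor(my_data$x12)   # levels "1" "2": 2 is the success
p_y01 <- unname(plogis(coef(
  glm(y01 ~ 1, family = binomial, data = my_data))))

c(default = p_default, releveled = p_releveled, recoded = p_y01)
```

Here p_default and p_y01 are the estimated probability of "Yes" (or 2), while p_releveled is the estimated probability of "No", so the first and second values sum to one.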
