
How do I interpret the coefficients of a glm with binomial error distribution?

I would be happy if someone could help me understand a glm with binomial error distribution.

Let's assume the following df:

year<-c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
        3, 3, 3, 3, 3, 3, 3, 3, 3, 3,3, 3, 3, 3, 3, 3, 3, 3)


success<-c(1,  0,  3,  1,  1,  2,  6,  0,  1,  1, 12,  2, NA,  6, 12,  0, 10,
           7,  4, 10, 13,  1,  2,  1, 18,  6,  3,  8,  3,  1,  9, 15,  6, 12,
           6, 15, 13,  6,  8,  6,  2, 11,  6, 1, 12,  0,  4, 15,  0,  3, 18,
           5,  6, 17,  5,  3, 17,  8,  0,  7, 12, 10, 26, 12,  4, 17,  1,  8,
           2,  7, 14,  8)

no_success<-c(1,  9,  5,  4,  6,  1,  4,  4,  6, 10, 16,  4, NA,  3, NA,  3,
              5,  5,  6, 10,  0,  5,  3, 10,  1,  7, 11,  8, 20,  4,  3,  3,
              19,  1, 11,  4,  6,  4,  9,  4, 10,  4,  2, 8,  3,  1, 13,  3,
              5,  7,  5,  9,  3,  6,  3,  4,  3, 13,  6,  5, 10,  3,  1,  0,
              18,  6, 13,  0,  3,  2,  2,  2)


df<-data.frame(year,success,no_success)

df$success<-as.integer(df$success)
df$no_success<-as.integer(df$no_success)

If I want to know whether success increases or decreases linearly across years for a made-up treatment, I fit a binomial glm with success and no_success as the response:

m<- glm(cbind(success, no_success)~year,
        data=df, family = "quasibinomial",
        na.action=na.exclude)
summary(m)

I changed to "quasibinomial" here because of overdispersion.
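One common way to check for overdispersion is to divide the Pearson chi-square statistic of the plain binomial fit by its residual degrees of freedom. The sketch below uses simulated toy data (the data frame `toy`, the seed, and all simulation parameters are assumptions for illustration, not the data from this question). Values well above 1 suggest overdispersion.

```r
set.seed(1)

# Hypothetical toy data, NOT the data from the question
toy <- data.frame(year = rep(0:3, each = 10))
n_trials <- 20

# Extra noise on the logit scale induces overdispersion
p <- plogis(-0.5 + 0.3 * toy$year + rnorm(nrow(toy), sd = 1))
toy$success <- rbinom(nrow(toy), size = n_trials, prob = p)
toy$no_success <- n_trials - toy$success

# Plain binomial fit, which fixes the dispersion at 1
fit_bin <- glm(cbind(success, no_success) ~ year,
               data = toy, family = binomial)

# Pearson chi-square divided by the residual degrees of freedom
dispersion <- sum(residuals(fit_bin, type = "pearson")^2) /
  df.residual(fit_bin)

# The quasibinomial fit estimates the same quantity and uses it to
# rescale the standard errors; the coefficients do not change
fit_quasi <- glm(cbind(success, no_success) ~ year,
                 data = toy, family = quasibinomial)

print(dispersion)
print(summary(fit_quasi)$dispersion)
```

Switching to quasibinomial leaves the estimates untouched and multiplies the standard errors by the square root of the estimated dispersion, which is why the p-values change.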

From the summary I see that there is a significant effect of year: p = 0.0219 *

As the coefficients in a binomial glm represent log odds, I get exp(estimate) = exp(0.3099) = 1.363

So the odds of success increase by a factor of 1.363 per year.

My questions are:

1.) When I compute exp() of a negative estimate, the result is always positive. This cannot be correct; there must be a way to express negative relationships.

2.) When I want to visualize multiple linear models, I like to display the estimates. In a "normal" lm I would display the estimate and confidence interval like this: divide the estimate by the mean of the observations, then subtract and add the standard error (also divided by the mean of the observations) times 1.96.

Estimate.mean <- exp(0.3099) / mean(df$or, na.rm = TRUE)
Std.Error.mean <- exp(0.1321) / mean(df$or, na.rm = TRUE)

low <- Estimate.mean - Std.Error.mean * 1.96
high <- Estimate.mean + Std.Error.mean * 1.96

If this confidence interval does not touch the zero line, the effect should be significant. Otherwise the effect is not significantly different from zero.

But here the lower bound is -0.3901804 and the upper bound is 1.608095. This does not appear to be a significant linear relationship, despite the low p-value from the glm (0.0219).

What have I mixed up here?

I would be grateful for any suggestions.

The "zero line" in this case is x=1 and not x=0.

Question 2: The question is whether there is an effect that differs from zero. On the exponentiated scale, an odds ratio of 1 basically means zero effect, because exp(0) = 1.

Question 1: When the estimate is exponentiated, the result cannot be negative. Instead, odds ratios below 1 express a negative effect.
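A minimal sketch of this point: the slope 0.3099 is the estimate from the question, while its negative counterpart and the zero slope are hypothetical values added only for comparison. The direction of the effect shows up relative to 1, not relative to 0.

```r
# Log-odds slopes: 0.3099 is from the question, the others are
# hypothetical values chosen only for comparison
log_odds <- c(decrease = -0.3099, none = 0, increase = 0.3099)

# Odds ratios are always positive; below 1 means the odds shrink,
# above 1 means the odds grow
odds_ratio <- exp(log_odds)

# Percent change in the odds per one-unit step in the predictor
percent_change <- (odds_ratio - 1) * 100

print(round(cbind(log_odds, odds_ratio, percent_change), 3))
```

A negative slope of the same size gives the reciprocal odds ratio, so the two effects are mirror images on the multiplicative scale.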

Here are some sources on calculating the confidence interval, for anyone stumbling over this post.

https://fromthebottomoftheheap.net/2018/12/10/confidence-intervals-for-glms/

https://stats.stackexchange.com/questions/304833/how-to-calculate-odds-ratio-and-95-confidence-interval-for-logistic-regression
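As a rough sketch of the general approach, the interval can be built on the log-odds scale first and then exponentiated, instead of exponentiating the estimate and the standard error separately. The estimate and standard error below are the values reported in the question. The multiplier 1.96 is the normal approximation used there, which is an assumption; a t quantile may be more appropriate for a quasibinomial fit.

```r
# Values reported by summary(m) in the question (log-odds scale)
estimate  <- 0.3099
std_error <- 0.1321
z <- 1.96  # normal approximation, as used in the question

# Wald interval on the log-odds (link) scale
ci_link <- estimate + c(lower = -1, upper = 1) * z * std_error

# Back-transform the estimate and both bounds to the odds-ratio scale
odds_ratio <- exp(estimate)
ci_or <- exp(ci_link)

# No effect means log-odds = 0, i.e. odds ratio = 1
excludes_no_effect <- ci_or[["lower"]] > 1 || ci_or[["upper"]] < 1

print(round(c(odds_ratio = odds_ratio, ci_or), 3))
print(excludes_no_effect)
```

The resulting interval is not symmetric around the odds ratio, which is expected after the back-transformation. With the fitted model from the question, `exp(confint.default(m))` should give the same kind of Wald interval directly.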
