How to fix an error with zero-inflated Poisson regression

I ran a zero-inflated Poisson regression with the pscl package and ran into the same error as the one described in this post.

However, since I know there is a separate process for excess zeros, indicated by z, does it still make sense to just run a plain Poisson regression as a solution (the Poisson results are fine)? Is there an alternative way to fix this problem for the ZIP model? I also tried zero-inflated negative binomial regression, but it gave the same error. Thanks.

Call:
zeroinfl(formula = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 | z, data = df)

Pearson residuals:
     Min       1Q   Median       3Q      Max 
-2.48465 -0.06156 -0.06126 -0.06091  5.57840 

Count model coefficients (poisson with log link):
                 Estimate Std. Error z value Pr(>|z|)
(Intercept)      3.547e+00         NA      NA       NA
x1              -3.251e-02         NA      NA       NA
x2               6.290e-03         NA      NA       NA
x3               8.867e-01         NA      NA       NA
x4               1.432e-01         NA      NA       NA
x5               2.705e-01         NA      NA       NA
x6              -8.223e-10         NA      NA       NA
x7              -7.218e-02         NA      NA       NA
x8               3.322e-02         NA      NA       NA
x9              -2.072e-01         NA      NA       NA

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    5.531         NA      NA       NA
z            158.108         NA      NA       NA
Error in if (getOption("show.signif.stars") & any(rbind(x$coefficients$count,  : 
  missing value where TRUE/FALSE needed

It's hard to answer this without a reproducible example, but I'll offer a couple of observations (too long for a comment):

  • pscl's default behaviour is to use the same formula for both the zero-inflation and the count (conditional) parts of the model. Unless you have an extremely large data set, you are very likely to have trouble fitting a 10-parameter model (intercept + 9 covariates) to both the count and zero-inflation aspects of the data. (A reasonable rule of thumb is that you should have 20 times as many observations as parameters, so with 20 parameters in total that's a minimum of 400 observations, and that rule is probably conservative for estimating zero-inflation.)
  • One of your parameter estimates (x6) is approximately zero, suggesting that you don't have enough variation in your data to estimate that parameter (or that there is some other issue with this covariate, e.g. an extreme outlier in this dimension). This could easily mess up the standard errors etc. for your whole model.
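
Two quick checks follow from the output in the question, sketched here in base R on simulated data. The values are invented; only the names `df`, `y`, `z` and `x6` come from the question, and `z` is assumed to be binary. Note that the call shown uses `| z`, so the zero-inflation part there has only an intercept and `z`; an estimate of about 158 on the logit scale is what you might see if `z` separates zeros from non-zeros almost perfectly. The sketch tabulates zeros against `z` to look for an empty cell, and standardizes `x6`, whose near-zero coefficient may simply reflect a very large measurement scale.

```r
# Simulated stand-in for the question's data (all values are made up).
set.seed(1)
n  <- 500
df <- data.frame(
  z  = rbinom(n, 1, 0.5),
  x6 = rnorm(n, mean = 1e6, sd = 2e5)  # deliberately on a huge scale
)
# Every observation with z == 1 is forced to zero, mimicking separation.
df$y <- ifelse(df$z == 1, 0L, rpois(n, lambda = 5))

# Check 1: cross-tabulate zero / non-zero counts against z.
# An empty cell means z predicts that outcome perfectly, which drives the
# logit coefficient toward infinity and can leave the Hessian singular.
sep_table <- table(zero = df$y == 0, z = df$z)
print(sep_table)
has_empty_cell <- any(sep_table == 0)

# Check 2: put x6 on a unit scale so its coefficient is of ordinary size.
df$x6_s <- as.numeric(scale(df$x6))
print(c(raw_sd = sd(df$x6), scaled_sd = sd(df$x6_s)))
```

If the table has an empty cell in your real data, the zero-inflation part is probably not estimable as specified, and refitting with the rescaled covariate is a cheap thing to try before simplifying the model further.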

General advice:

  • Plot your data.
  • Find a reasonably complex model that you can actually fit by splitting the difference between an over-complicated model that breaks and an over-simplified model that misses important phenomena, as in this figure:

[Figure from Uriarte and Yackulic, Ecological Applications, 19(3), 2009, pp. 592–596]
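
As a rough illustration of starting simple, the sketch below looks at the count distribution and then asks whether a one-covariate Poisson GLM already accounts for the zeros. It uses base R only and simulated data; the data-generating values, and the choice of `x1` as the single covariate, are assumptions made for the example rather than anything taken from the question.

```r
# Simulated data: Poisson counts in x1, plus extra zeros that are more
# likely when z == 1 (all parameter values here are invented).
set.seed(42)
n  <- 400
df <- data.frame(x1 = rnorm(n), z = rbinom(n, 1, 0.3))
extra_zero <- rbinom(n, 1, plogis(-2 + 2 * df$z))
df$y <- ifelse(extra_zero == 1, 0L, rpois(n, lambda = exp(1 + 0.3 * df$x1)))

# Look at the raw count distribution first (drawn only in an interactive session).
if (interactive()) barplot(table(df$y), xlab = "y", ylab = "frequency")

# Simple baseline: an ordinary Poisson GLM with a single covariate.
m_pois <- glm(y ~ x1, family = poisson, data = df)

# Compare the observed share of zeros with the share the Poisson fit expects.
obs_zero <- mean(df$y == 0)
exp_zero <- mean(dpois(0, lambda = fitted(m_pois)))
cat(sprintf("observed zeros: %.3f, expected under Poisson: %.3f\n",
            obs_zero, exp_zero))

# A possible next step, which needs the pscl package:
#   zeroinfl(y ~ x1 | z, data = df)
# adding covariates one at a time for as long as the standard errors stay finite.
```

A clear excess of observed over expected zeros suggests the zero-inflation part is worth keeping; if the two shares are close, the plain Poisson model may be a defensible answer to the original question.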
