Zero-inflated negative binomial function NaN warning

Question

I am trying to fit my data to a zero-inflated negative binomial model, but one of my 3 independent variables (Exposure) seems to cause NaNs to be produced when the SE is calculated in the summary function (see the very end of the zeroinfl output). I have also tried running a negative binomial hurdle model and ran into a similar issue.

> str(eggTreat)
'data.frame':   455 obs. of  4 variables:
 $ Exposure : Factor w/ 2 levels "C","E": 2 2 2 2 2 2 2 2 2 2 ...
 $ hi_lo    : Factor w/ 2 levels "hi","lo": 2 2 2 2 2 2 2 2 2 2 ...
 $ Egg_count: int  0 0 0 0 0 0 0 0 0 0 ...
 $ Food     : Factor w/ 2 levels "1.5A5YS","5ASMQ": 2 2 2 2 2 2 2 2 2 2 ...

> mod.zeroinfl <- zeroinfl(Egg_count ~ Food+Exposure+hi_lo | Food+Exposure+hi_lo, data=eggTreat,
+                          dist="negbin")
> summary(mod.zeroinfl)

Call:
zeroinfl(formula = Egg_count ~ Food + Exposure + hi_lo | Food + Exposure + hi_lo, data = eggTreat, dist = "negbin")

Pearson residuals:
     Min       1Q   Median       3Q      Max 
-0.65632 -0.47163 -0.28588  0.02976  9.00804 

Count model coefficients (negbin with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -0.04435    0.14393  -0.308   0.7580    
Food        -1.12486    0.22267  -5.052 4.38e-07 ***
Exposure    -2.34990    0.38684  -6.075 1.24e-09 ***
hi_lo       -0.44893    0.19524  -2.299   0.0215 *  
Log(theta)  -0.24387    0.22639  -1.077   0.2814    

Zero-inflation model coefficients (binomial with logit link):
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.830e+01         NA      NA       NA
Food        -5.768e+00  5.628e+04       0        1
Exposure     4.612e-01         NA      NA       NA
hi_lo       -7.477e+00  9.963e+05       0        1
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Theta = 0.7836 
Number of iterations in BFGS optimization: 21 
Log-likelihood: -350.2 on 9 Df
Warning message:
In sqrt(diag(object$vcov)) : NaNs produced

function (object, ...) 
{
  object$residuals <- residuals(object, type = "pearson")
  kc <- length(object$coefficients$count)
  kz <- length(object$coefficients$zero)
  se <- sqrt(diag(object$vcov))

Answer 1

This problem is typically caused by complete separation. Searching for that term, or for the Hauck-Donner effect, will show you that some linear combination of your predictor values perfectly separates the zeros from the non-zeros. Since the predictor variables in your zero-inflation model are all categorical, this translates to a combination of categories in which all the values are zero, or all are non-zero.

I would take a look at with(eggTreat, table(Egg_count>0, Food, Exposure, hi_lo)) (arrange the arguments in whatever order makes the table easiest to read).
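As a rough illustration of what that table can reveal, here is a minimal sketch on simulated data. The data frame `toy` and its contents are made up for this example (they are not the asker's data); the counts are deliberately forced to zero in the Exposure "E" / hi_lo "lo" cells so that the separation is visible.

```r
# Simulated stand-in for eggTreat (hypothetical data, not the asker's).
set.seed(42)
toy <- expand.grid(
  Food     = factor(c("1.5A5YS", "5ASMQ")),
  Exposure = factor(c("C", "E")),
  hi_lo    = factor(c("hi", "lo")),
  rep      = 1:20
)
# Poisson counts with mean 3 everywhere, then force the E/lo cells to all zeros.
toy$Egg_count <- rpois(nrow(toy), lambda = 3)
toy$Egg_count[toy$Exposure == "E" & toy$hi_lo == "lo"] <- 0L

# Cross-tabulate "any eggs?" against the factor combinations.
tab <- with(toy, table(nonzero = Egg_count > 0, Food, Exposure, hi_lo))
ftable(tab, row.vars = c("Food", "Exposure", "hi_lo"))

# Proportion of non-zero counts per cell; 0 or 1 means the cell is one-sided.
cells <- aggregate(Egg_count ~ Food + Exposure + hi_lo, data = toy,
                   FUN = function(y) mean(y > 0))
names(cells)[4] <- "prop_nonzero"
cells[cells$prop_nonzero %in% c(0, 1), ]
```

Any row returned by the last line is a category combination in which the outcome is entirely zero or entirely non-zero, which is the pattern to look for in the real data.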

Typical symptoms include:

large parameter values (e.g. |beta| > 10); in this case your intercept is -18.3, which gives a predicted zero-inflation probability of about 1e-8 in the baseline category (two of the other values are also large, although not nearly as extreme as the intercept)
extremely large standard errors (Food, hi_lo), leading to z-values that are effectively zero and p-values that are effectively 1
or the NA values you're seeing
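The first and last symptoms can be checked numerically. The sketch below converts the reported intercept to a probability with the inverse logit, and then shows how sqrt(diag(vcov)) yields NaN when a diagonal entry is negative. The matrix `vc` is invented for illustration; the assumption here is that a numerically ill-behaved Hessian produced something similar in the real fit.

```r
# Inverse logit of the zero-inflation intercept reported in the summary.
p0 <- plogis(-18.3)
p0                     # about 1.1e-08

# A made-up 2x2 "covariance matrix" with one negative diagonal entry.
vc <- matrix(c(-1e-3, 0, 0, 4), nrow = 2)

# The same computation as in the summary method: sqrt of the diagonal.
se <- suppressWarnings(sqrt(diag(vc)))
se                     # NaN for the negative entry, 2 for the other
```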

There are various solutions to this problem:

different forms of regularization or Bayesian priors
compute the p-values using model comparison/likelihood ratio tests
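The likelihood ratio approach can be sketched as follows. To keep the example in base R, it uses a simplified analogue: a logistic regression on a made-up zero/non-zero indicator with one separated level, not a zeroinfl fit. The helper `lrt` is hypothetical (written for this example); it only relies on logLik() and its "df" attribute, so it should in principle also apply to two nested zeroinfl fits that differ by one zero-inflation term.

```r
# Likelihood ratio test for two nested fits that provide a logLik() method.
lrt <- function(reduced, full) {
  ll0  <- logLik(reduced)
  ll1  <- logLik(full)
  stat <- 2 * (as.numeric(ll1) - as.numeric(ll0))
  df   <- attr(ll1, "df") - attr(ll0, "df")
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Hypothetical data: level "b" has no zeros at all (quasi-complete separation).
set.seed(1)
toy <- data.frame(x = factor(rep(c("a", "b"), each = 50)))
toy$nonzero <- c(rbinom(50, 1, 0.5), rep(1L, 50))

# glm() warns about fitted probabilities of 0 or 1 here; that is expected.
full    <- suppressWarnings(glm(nonzero ~ x, family = binomial, data = toy))
reduced <- glm(nonzero ~ 1, family = binomial, data = toy)

coef(summary(full))       # Wald z for x is near 0 because the SE is enormous
res <- lrt(reduced, full)
res                       # the likelihood ratio test still detects the effect
```

The Wald test fails because it depends on the curvature of the likelihood at an estimate that has drifted toward infinity, while the likelihood ratio test only compares the two maximized log-likelihoods.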

The zero-inflation coefficients from the question, repeated for reference:

Zero-inflation model coefficients (binomial with logit link):
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.830e+01         NA      NA       NA
Food        -5.768e+00  5.628e+04       0        1
Exposure     4.612e-01         NA      NA       NA
hi_lo       -7.477e+00  9.963e+05       0        1

Answer 2

An answer to a similar problem has been posted on Cross Validated (CrossValidated-NA-ZINB), but what I found useful was to rescale my variables. For example, I had the number of hectares of forest in a village, ranging from 0 to 100,000. I converted it to hundreds of square kilometers, ranging from 0 to 10, and the NaNs previously shown for Std. Error, z value and Pr(>|z|) became valid numbers.
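The unit conversion described above is a simple division. The vector `forest_ha` below is hypothetical; the likely reason rescaling helps (an assumption, not something stated in the answer) is that predictors on a very large scale give tiny coefficients and a poorly conditioned Hessian.

```r
# Hypothetical predictor: forest area per village, in hectares (0 to 100,000).
forest_ha <- c(0, 250, 12000, 56000, 100000)

# 1 km^2 = 100 ha, so 100 km^2 = 10,000 ha.
forest_100km2 <- forest_ha / 1e4
range(forest_100km2)   # 0 10
```

The rescaled variable would then replace the original one in the model formula; the fitted coefficient is simply 10,000 times larger, and the model itself is unchanged.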

Answer 1
0 accepted 2020-11-23 06:01:45

Answer 2
0 2023-01-19 00:54:50
