
Error in jags.model node inconsistent with parents

I'm using a beta-binomial N-mixture model to estimate abundance. The code has worked well on both simulated and natural datasets. However, when I run one particular dataset, this error pops up: Error in jags.model(etc...) Error in node N[1] Node inconsistent with parents. I'm wondering why it has a problem with this dataset and not the others. There are a fair number of zeros in the dataset that does not work, but the dataset that does work has even more zeros. I've played around with the prior values, ranging them from 0.001 to 10. The data structure and model are as follows:

n.site <- 6 
R <- n.site #number of sites
T <- 10 #number of replicate counts
y <- array(dim = c(R, T)) #null array

y <- array(sur.2019$Num_total, dim = c(R, T)) #populate y
C <- c(y)
C <- as.numeric(C) 


site = 1:R              
site.p <- rep(site, T)

sink("Model.txt")
cat("
    model {
    
    # Priors
    lam~dgamma(0.01,0.01)
    alpha.p~dgamma(0.01,0.01)
    beta.p~dgamma(0.01,0.01)
    
    # Likelihood
    #the next four lines are used to model N as a zero-truncated distribution
    probs[R+1]<- 1-sum(probs[1:R])
    for(i in 1:R){
    probs[i]<- exp(-lam)*(pow(lam,x[i]))/(exp(logfact(x[i])) * (1-exp(-lam)))
    N[i] ~ dcat(probs[])}
    
    # Observation model for replicated counts
    for (i in 1:n) {                   
    C[i] ~ dbin(p[i], N[site.p[i]])
    p[i]~dbeta(alpha.p,beta.p)
    
    # Assess model fit using Chi-squared discrepancy
    # Compute fit statistic for observed data
    eval[i] <- p[i]*N[site.p[i]]
    E[i] <- pow((C[i] - eval[i]),2) / (eval[i] + 0.5)
    # Generate replicate data and compute fit stats
    C.new[i] ~ dbin(p[i], N[site.p[i]])
    E.new[i] <- pow((C.new[i] - eval[i]),2) / (eval[i] + 0.5)
    
    } # ends i loop
    
    
    # Derived and other quantities
    totalN <- sum(N[])  # Estimate abundance across all sites
    mean.abundance <- lam #mean expected abundance per plot
    p.derived<-alpha.p/(alpha.p+beta.p) #derived detection probability
    rho.derived<-1/(alpha.p+beta.p+1)  #correlation coefficient
    
    
    fit <- sum(E[])
    fit.new <- sum(E.new[])
    
    }
    ",fill = TRUE)
sink()

R = nrow(y)
T = ncol(y)
n = dim(y)[1] * dim(y)[2]#number of observations (sites*surveys)

nmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)

# Initial values
Nst <- apply(y, 1, max) + 1 #changed from apply(y, 1, max) + 1
Nst[is.na(Nst)] <- 1
inits <- function(){list(N = Nst, lam = runif(1, 1, 7),alpha.p=runif(1,0.5,1), beta.p=runif(1,0.5,1))}

# Define parameters to be monitored
params <- c("totalN", "mean.abundance", "lam", "p.derived", "rho.derived", "fit", "fit.new","alpha.p","beta.p")

# MCMC settings
ni <- 14000
nt <- 1
nb <- 4000
nc <- 3



abun.1   <- jags(data       = nmm.data,
                 parameters = params,
                 inits      = inits,
                 model      = "Model.txt", # same file name as written by sink() above
                 n.thin     = nt, 
                 n.chains   = nc,
                 n.burnin   = nb,
                 n.iter     = ni)

abun.1.mcmc <- as.mcmc(abun.1)

The dataset that does not work looks like this

nmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
nmm.data
$C
 [1] 7 0 1 0 0 0 2 0 0 0 0 1 3 2 0 0 0 0 1 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[55] 0 0 0 0 0 0

$n
[1] 60

$R
[1] 6

$site.p
 [1] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[55] 1 2 3 4 5 6

$x
[1] 1 2 3 4 5 6

While the dataset that does work looks like this

hmm.data <- list(C = C, n=n, R = R, site.p = site.p, x=1:R)
hmm.data

$C
  [1] 0 0 0 0 0 2 0 1 0 0 0 0 0 3 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 0
 [55] 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[109] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[163] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[217] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

$n
[1] 60

$R
[1] 6

$site.p
  [1] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
 [55] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[109] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[163] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
[217] 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6

$x
[1] 1 2 3 4 5 6

As I've said, I've tried to play around with the priors but still get the issue at node N[1]. I'm new to JAGS and am not sure how to approach solving this problem as it doesn't seem to be a coding issue but a data issue. Anyone have any ideas what may be going on? I appreciate anyone taking the time to look over it.

The dput for nmm.data is

list(C = c(7, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0, 
0, 1, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0), n = 60L, R = 6L, site.p = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 
3L, 4L, 5L, 6L), x = 1:6)

And for hmm.data it is

list(C = c(0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 3, 0, 1, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 
0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), n = 258L, R = 6L, site.p = c(1L, 
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 
6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 
6L), x = 1:6)

After carefully checking your code, I was able to fit a Bayesian model using nmm.data. I found two issues. First, your priors could be producing very small values, and that can cause problems in the model. You can change them, but I did not. Second, your initial values are too large, and that affects the performance of the chains. With priors that favour small values and large initial values, a high number of iterations can make things go wrong. So what I did with the troublesome dataset was reduce the number of iterations (your data are also small; with larger data such as the other list, these issues do not appear). Here is the code; in the final part I will give you some options for extracting results.

First the model (no changes here, but I added a fixed random seed to keep results similar):

library(rjags)
set.seed(123)
#Code for model
mymod <- "

model {
    
    # Priors
    lam~dgamma(0.01,0.01)
    alpha.p~dgamma(0.01,0.01)
    beta.p~dgamma(0.01,0.01)
    
    # Likelihood
    #the next four lines are used to model N as a zero-truncated distribution
    probs[R+1]<- 1-sum(probs[1:R])
    for(i in 1:R){
    probs[i]<- exp(-lam)*(pow(lam,x[i]))/(exp(logfact(x[i])) * (1-exp(-lam)))
    N[i] ~ dcat(probs[])}
    
    # Observation model for replicated counts
    for (i in 1:n) {                   
    C[i] ~ dbin(p[i], N[site.p[i]])
    p[i]~dbeta(alpha.p,beta.p)
    
    # Assess model fit using Chi-squared discrepancy
    # Compute fit statistic for observed data
    eval[i] <- p[i]*N[site.p[i]]
    E[i] <- pow((C[i] - eval[i]),2) / (eval[i] + 0.5)
    # Generate replicate data and compute fit stats
    C.new[i] ~ dbin(p[i], N[site.p[i]])
    E.new[i] <- pow((C.new[i] - eval[i]),2) / (eval[i] + 0.5)
    
    } # ends i loop
    
    
    # Derived and other quantities
    totalN <- sum(N[])  # Estimate abundance across all sites
    mean.abundance <- lam #mean expected abundance per plot
    p.derived<-alpha.p/(alpha.p+beta.p) #derived detection probability
    rho.derived<-1/(alpha.p+beta.p+1)  #correlation coefficient
    
    
    fit <- sum(E[])
    fit.new <- sum(E.new[])
    
    }
    "
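As a side note on the likelihood block, the probs vector can be reproduced in plain R to see what N[i] is allowed to be. This is only a sketch: lam = 3 is an arbitrary illustrative value (not an estimate), and support is a name introduced here. Because dcat() returns a category index, the construction limits each N[i] to the integers 1 to R + 1.

```r
# Zero-truncated Poisson probabilities, built the same way as in the model block.
# lam = 3 is an arbitrary illustrative value, not an estimate from the data.
R   <- 6
x   <- 1:R
lam <- 3

probs <- numeric(R + 1)
probs[1:R]   <- exp(-lam) * lam^x / (factorial(x) * (1 - exp(-lam)))
probs[R + 1] <- 1 - sum(probs[1:R])  # remaining mass lumped into the last category

length(probs)  # R + 1 categories
sum(probs)     # the categories sum to one

# dcat() returns a category index, so N[i] can only take these values
support <- seq_along(probs)
range(support)
```

With R = 6 the largest abundance the model can represent at a site is therefore 7.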

Now we define the same values as you did, along with the initial values function (I slightly adjusted the function):

#Data
#Values
n.site <- 6 
R <- n.site #number of sites
T <- 10 #number of replicate counts
y <- array(dim = c(R, T)) #null array
# Initial values
# y is still all NA at this point, so every row maximum is NA
# and Nst ends up as 1 for each site after the replacement below
Nst <- apply(y, 1, max) + 1
Nst[is.na(Nst)] <- 1
inits <- function(){list(N = Nst, lam = runif(1, 1, 7),
                         alpha.p=runif(1,0.1,0.5),
                         beta.p=runif(1,0.5,1))}
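One concrete way for an initial value to be too large, which may be what triggers the error at N[1] in the question: dcat() over a probs vector of length R + 1 can only return 1 to 7 here, while the question's initial values are the site maximum plus one. A minimal sketch in base R, assuming the counts from the dput of nmm.data; max.count, K and Nst.safe are names introduced only for illustration.

```r
# Counts from the failing dataset (copied from the dput of nmm.data$C)
C <- c(7, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0, 0,
       1, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, rep(0, 30))
R      <- 6
site.p <- rep(1:R, times = 10)

# Largest count observed at each site
max.count <- tapply(C, site.p, max)

# Initial values as in the question: site maximum plus one
Nst <- max.count + 1

# Number of categories in probs[], i.e. the largest value dcat() can return
K <- R + 1

# Sites whose initial N lies outside the support of dcat()
which(Nst > K)

# Hypothetical alternative: cap the initial values at K
# (they must still be at least as large as the observed maxima)
Nst.safe <- pmin(Nst, K)
all(Nst.safe >= max.count)
```

Site 1 has a maximum count of 7, so its initial value of 8 is outside 1 to 7. Capping at K is only one option; another would be to give probs more categories so that larger abundances are allowed, which changes the model rather than just the initial values.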

Next, we define the settings for the model and fit it (I did not use some of these values, but you can try different settings):

# MCMC settings using nmm.data
ni <- 14000
nt <- 1
nb <- 4000
nc <- 3
#Model
m1 <- jags.model(file=textConnection(mymod),
           data=nmm.data,n.chains=3,inits=inits(),quiet = TRUE) #TRUE rather than T, because T was reassigned to 10 above
#Update
update(m1, n.iter=1000,progress.bar = "none")

With the previous code, the model has been fitted and updated. I reduced the number of iterations because of the small values in your data. Increasing that number can generate infinite values, and the fitting process will stop. Our model is now ready. We will use the same as.mcmc() function to create the object you want, and the coda.samples() function to extract the parameters from each chain:

#Parameters
params <- c("totalN", "mean.abundance", "lam", "p.derived",
            "rho.derived", "fit", "fit.new","alpha.p","beta.p")
#Extract results
res1 <- coda.samples(m1,variable.names=params,n.iter=1000)
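#Code for extracting: as.mcmc() needs the sampled values rather than the model object,
#so the chains stored in res1 are pooled into one matrix first
m1.mcmc <- as.mcmc(do.call(rbind, res1))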
#Example
d1 <- as.data.frame(res1[[1]])

Your object m1.mcmc is now created, and res1 holds the results from each chain for the parameters. As an example I added d1, whose output is shown next (only the first rows, head(d1), because the output is big):

     alpha.p    beta.p      fit  fit.new      lam mean.abundance  p.derived rho.derived totalN
1 0.04932755 0.5086096 3.390977 3.765055 2.517308       2.517308 0.08841059   0.6418744     16
2 0.06047980 0.5160444 2.218640 3.986287 2.453345       2.453345 0.10490418   0.6343068     18
3 0.05784805 0.4795353 1.286302 2.862089 3.213626       3.213626 0.10764764   0.6504558     17
4 0.05672757 0.3837686 2.980358 2.260288 1.679854       1.679854 0.12878106   0.6942052     15
5 0.05653906 0.8254216 6.293583 2.671746 3.382379       3.382379 0.06410611   0.5313607     15
6 0.06015773 0.5851066 5.566315 2.555359 5.338035       5.338035 0.09322960   0.6078051     26
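The fit and fit.new columns can be turned into a posterior predictive check. A minimal sketch, assuming the usual Bayesian p-value (the share of draws in which the replicate discrepancy exceeds the observed one); it uses only the six rows printed above, so the number is purely illustrative, and bpv is a name introduced here.

```r
# First six draws of the discrepancy statistics, copied from head(d1)
fit     <- c(3.390977, 2.218640, 1.286302, 2.980358, 6.293583, 5.566315)
fit.new <- c(3.765055, 3.986287, 2.862089, 2.260288, 2.671746, 2.555359)

# Bayesian p-value: proportion of draws where the replicate data fit worse
# than the observed data
bpv <- mean(fit.new > fit)
bpv
```

With the full output the same line would be mean(d1$fit.new > d1$fit). Values near 0.5 are usually read as adequate fit, and values close to 0 or 1 as a sign of lack of fit.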

So, with larger data the chains converge, whereas with smaller data such as nmm.data you need to be careful about the initial values and the number of iterations.
