
Estimating AR(1) coefficients using the Metropolis-Hastings algorithm (MCMC) in R

I am trying to write a program to estimate AR(1) coefficients using the Metropolis-Hastings algorithm. My R code is as follows:

set.seed(101)

#loglikelihood
logl <- function(b,data) {
  ly = length(data)
  -sum(dnorm(data[-1],b[1]+b[2]*data[1:(ly-1)],(b[3])^2,log=TRUE))
}


#proposal function
proposalfunction <- function(param,s){
  return(rnorm(3,mean = param, sd= s))
}

#MH sampler
MCMC <- function(startvalue, iterations,data,s){
  i=1
  chain = array(dim = c(iterations+1,3))
  chain[i,] = startvalue
  while (i <= iterations){
    proposal = proposalfunction(chain[i,],s)
    probab = exp(logl(proposal,data = data) - logl(chain[i,],data = data))
    if(!is.na(probab)){
      if (runif(1) <= min(1,probab)){
        chain[i+1,] = proposal
      }else{
        chain[i+1,] = chain[i,]
      }
      i=i+1
    }else{
      cat('\r !')
    }
  }
  acceptance = round((1-mean(duplicated(chain)))*100,1)
  print(acceptance)
  return(chain)
}

#example
#generating data
data <- arima.sim(list(order = c(1,0,0), ar = 0.7), n = 2000,sd = sqrt(1))

r=MCMC(c(0,.7,1),50000,data,s=.00085)

In this example, I should get roughly zero for the mean, 0.7 for the AR coefficient, and 1 for the error variance, but every time I run this code I get completely different values. I have tried adjusting the proposal scale, but the results are still far from the true values. The figure below shows the results.

[figure: results of the MCMC run]

You've flipped the sign in your log-likelihood function. This is an easy mistake to make because maximum likelihood estimation usually proceeds by minimizing the negative log-likelihood, but MCMC needs to work with the (log-)likelihood itself, not its negative.
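
To see the effect, here is a quick check you could run (just an illustration, assuming the simulated data from the question is in your workspace; ll below is simply the correctly signed version of your log-likelihood, and good/bad are made-up parameter vectors):

ll <- function(b, data) {
  sum(dnorm(data[-1], b[1] + b[2]*head(data, -1), b[3], log = TRUE))
}
good <- c(0, 0.7, 1)   # close to the true parameter values
bad  <- c(0, 0.2, 2)   # a clearly worse fit

# with the correct sign, a move from `bad` to `good` has Metropolis ratio
exp(ll(good, data) - ll(bad, data))   # astronomically large: always accepted
# with the flipped sign (as in your code), the same move has ratio
exp(ll(bad, data) - ll(good, data))   # essentially zero: almost never accepted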

Also:

  • dnorm() takes the standard deviation as its third argument, not the variance. You can simplify your code slightly by using head(data,-1) to get all but the last element in a vector. So your log-likelihood would be:
sum(dnorm(data[-1],b[1]+b[2]*head(data,-1),b[3],log=TRUE))
  • you're probably hurting yourself by fixing the proposal distribution to be independent Normal with the same SD for all three parameters; allowing the SDs to differ lets you tune the sampler for each parameter separately (although the posterior SDs are not as different as I thought they might be, roughly {0.02, 0.016, 0.0077}, so this might not be such a big problem). A sketch combining both fixes follows below.
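
Putting both fixes together, a minimal sketch might look like the following. It keeps your MCMC() sampler exactly as written and only swaps in the corrected log-likelihood plus a vector of per-parameter proposal SDs; the SD values are rough guesses borrowed from the posterior SDs quoted above, not carefully tuned.

set.seed(101)

# correctly signed log-likelihood; dnorm() gets the SD b[3], not the variance
logl <- function(b, data) {
  sum(dnorm(data[-1], b[1] + b[2]*head(data, -1), b[3], log = TRUE))
}

# random-walk proposal with a separate SD for each parameter
proposalfunction <- function(param, s) {
  rnorm(3, mean = param, sd = s)
}

data <- arima.sim(list(order = c(1, 0, 0), ar = 0.7), n = 2000, sd = 1)

# s is now a length-3 vector of proposal SDs (illustrative, not tuned)
r <- MCMC(c(0, 0.7, 1), 50000, data, s = c(0.02, 0.016, 0.008))

# posterior means after discarding burn-in; these should land near 0, 0.7 and 1
colMeans(r[-(1:10000), ])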
