简体   繁体   中英

Running error within function using solve() in R?

With this program below, I will get the error:

solve.default(Sigma0[cs.idx, cs.idx]) : 'a' is 0-diml

But, when I check the em() function step by step, I mean, sentence by sentence without function, there is no error within solve() . So I am confused and desperate for help, Thank you!

###----------------------------------------------------------------
### Maximal Likelihood estimation of mean and covariance 
###  for multivariate normal distribution by EM algorithm,
###             for demonstration purposes only
###----------------------------------------------------------------

em<-function(xdata,mu0,Sigma0){
  n<-nrow(xdata)
  p<-ncol(xdata)

   err<-function(mu0,Sigma0,mu1,Sigma1){
    th0<-c(mu0,as.vector(Sigma0))
    th1<-c(mu1,as.vector(Sigma1))
    sqrt(sum((th0-th1)*(th0-th1)))
   }

  mu1<-mu0+1
  Sigma1<-Sigma0+1
  while(err(mu0,Sigma0,mu1,Sigma1)>1e-6){
     mu1<-mu0
     Sigma1<-Sigma0

     zdata<-xdata
     Ai<-matrix(0,p,p)
    for(i in 1:n){
     if(any(is.na(xdata[i,]))){
      zi<-xdata[i,]
      na.idx<-(1:p)[is.na(zi)]
      cs.idx<-(1:p)[-na.idx]
      Sigma012<-Sigma0[na.idx,cs.idx,drop=FALSE]
      Sigma022.iv<-solve(Sigma0[cs.idx,cs.idx])
      zdata[i,na.idx]<-mu0[na.idx]+(Sigma012%*%Sigma022.iv)%*%(zi[cs.idx]-mu0[cs.idx])
      Ai[na.idx,na.idx]<-Ai[na.idx,na.idx]+Sigma0[na.idx,na.idx]-Sigma012%*%Sigma022.iv%*%t(Sigma012)
       }
     }
    mu0<-colMeans(zdata)
    Sigma0<-(n-1)*cov(zdata)/n+Ai/n
   }
 return(list(mu=mu0,Sigma=Sigma0))
}

##A simulation example
library(MASS)
set.seed(1200)
p=3
mu<-c(1,0,-1)
n<-1000
Sig <- matrix(c(1, .7, .6, .7, 1, .4, .6, .4, 1), nrow = 3)
triv<-mvrnorm(n,mu,Sig)
misp<-0.2   #MCAR probability
misidx<-matrix(rbinom(3*n,1,misp)==1,nrow=n)
triv[misidx]<-NA 

#exclude the cases whose entire elements were missed
er<-which(apply(apply(triv,1,is.na),2,sum)==p)   
if(length(er)>=1) triv<-triv[-er,]

#initial values
mu0<-rep(0,p)
Sigma0<-diag(p)
system.time(rlt<-em(triv,mu0,Sigma0))

#a better initial values
mu0<-apply(triv,2,mean,na.rm=TRUE)
nas<-is.na(triv)
na.num<-apply(nas,2,sum)
zdata<-triv
zdata[nas]<-rep(mu0,na.num)
Sigma0<-cov(zdata)

system.time(rlt<-em(triv,mu0,Sigma0))

Your er<-which(apply(apply(triv,1,is.na),2,sum)==) piece of code is not valid. As a comment above it states, you wish to remove complete NA cases. If so, er<-which(apply(apply(triv,1,is.na),2,sum)==ncol(triv)) is the right piece of code.

The error itself happens when there is a complete NA case still present in triv when being passed to em . At some point, cs.idx is empty, so Sigma0[cs.idx,cs.idx] is also empty, which is reflected by the error message.

However, if the correction above is applied, everything runs fine:

> system.time(rlt<-em(triv,mu0,Sigma0))
   user  system elapsed 
   0.46    0.00    0.47 
> rlt
$mu
[1]  0.963058487 -0.006246175 -1.024260183

$Sigma
          [,1]      [,2]      [,3]
[1,] 0.9721301 0.6603700 0.5549126
[2,] 0.6603700 1.0292379 0.3745184
[3,] 0.5549126 0.3745184 0.9373208

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM