R: NaNs produced in Log-Likelihood optimisation

I am working on identifying SVAR models and have some questions about the following log-likelihood function:

http://imgur.com/msJZkB4

We want to estimate the values of B_22, which is a 2x2 matrix, and Omega (in my code I call this matrix "L"), which has the unknown parameters on the diagonal and zeros elsewhere. Thus:

B <- matrix(c(theta[1:4]),nrow=2,ncol=2)

(edit: Omega:)

L <- matrix(c(theta[5],0,0,theta[6]),nrow=2,ncol=2)

Sigma1 and Sigma2 are treated as known; they are estimated from a 2xn matrix "u", which has been chosen arbitrarily.

  1. When I run the code, I get dozens of "NaNs produced" messages because log(det(.)) is evaluated at negative determinants. This should not be possible, since B %*% t(B) and B %*% L %*% t(B) are covariance matrices and thus their determinants must be positive. I have read several posts about using the L-BFGS-B method and restricting the parameters, but (i) applying the lower bound gives me an error in the source code and (ii) I am not sure whether restricting the parameters in such a way will distort the optimisation results. How can I solve this problem? Is the L-BFGS-B method with restricted parameters the right way to go?
  2. My second question concerns the starting values of the optim() function. I have realised that changing the starting values leads to different optimisation outcomes. How do I interpret this result? Does it mean that the model I am trying to estimate is not sufficiently identified?
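On the first point, a short check may help show where the negative determinants come from. B %*% t(B) always has a non-negative determinant, but B %*% L %*% t(B) is a valid covariance matrix only when both diagonal entries of L are positive, because det(B L B') = det(B)^2 * theta[5] * theta[6]. An unconstrained optimiser is free to try negative values for theta[5] or theta[6]. The numbers below are arbitrary illustrations, not estimates from the data.

```r
# Arbitrary illustrative values (assumptions, not fitted parameters)
B <- matrix(c(2, 1, 0.5, 3), nrow = 2, ncol = 2)
L <- diag(c(1.5, -0.2))                      # one negative diagonal entry

S2 <- B %*% L %*% t(B)
det_direct  <- det(S2)
det_formula <- det(B)^2 * prod(diag(L))      # det(B L B') = det(B)^2 * det(L)

det_direct                                   # negative
all.equal(det_direct, det_formula)           # agree up to rounding
suppressWarnings(log(det_direct))            # NaN
```

So the NaNs do not contradict the covariance argument; they indicate that the optimiser has visited parameter values at which B %*% L %*% t(B) is no longer a covariance matrix.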

I hope the information I have given is enough to answer my questions and that my formatting is readable. Any help or hints (also on my coding, since I am relatively new to programming/R) are appreciated.

sig1 <- 1/63 * u[, 1:63] %*% t(u[, 1:63])  # break points 63 and 28 were chosen arbitrarily
logl <- -(63/2) * (log(det(B %*% t(B))) + sum(diag(sig1 %*% solve(B %*% t(B))))) -
         (28/2) * (log(det(B %*% L %*% t(B))) + sum(diag(sig2 %*% solve(B %*% L %*% t(B)))))
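The lines above are fragments, so here is a minimal sketch of how they might be assembled into a function `LL(theta, u)` that `optim()` can call. Two things are assumptions rather than part of the post: that `sig2` is computed from the remaining 28 columns (64 to 91) in the same way as `sig1`, and that the function returns the negative log-likelihood because `optim()` minimises by default. The matrix `u_sim` is simulated stand-in data, not the residuals listed below.

```r
# Sketch only: the 64:91 split for sig2 and the sign convention are assumptions.
LL <- function(theta, u) {
  B <- matrix(theta[1:4], nrow = 2, ncol = 2)
  L <- diag(theta[5:6])                        # Omega: unknowns on the diagonal
  sig1 <- u[, 1:63]  %*% t(u[, 1:63])  / 63    # first regime, 63 observations
  sig2 <- u[, 64:91] %*% t(u[, 64:91]) / 28    # second regime, 28 observations (assumed)
  S1 <- B %*% t(B)
  S2 <- B %*% L %*% t(B)
  logl <- -(63/2) * (log(det(S1)) + sum(diag(sig1 %*% solve(S1)))) -
           (28/2) * (log(det(S2)) + sum(diag(sig2 %*% solve(S2))))
  -logl   # optim() minimises, so return the negative log-likelihood
}

# Simulated stand-in for the 2 x 91 matrix u, for illustration only
set.seed(1)
u_sim <- matrix(rnorm(2 * 91), nrow = 2)
val <- LL(c(1, 0, 0, 1, 1, 1), u_sim)          # B and L both set to the identity
val
```

With B and L equal to the identity, the log-determinant terms vanish and the value reduces to half the sum of squared entries of `u_sim`, which is a convenient sanity check.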


u[1,]   -2.0739942  -2.152562   6.3569442   8.813618    -4.4750621  -2.20355587 -2.32608476 -5.32235864 -1.1783355  2.3010929   -2.3281323  4.8122883   -0.6523752  2.1975880   4.4731109   -3.880578   2.82303865  -0.29450020 -2.2489995  -7.2447985  4.996482982 1.04475829  0.2690333   -5.4314632  4.5957677   -0.9616699  0.5806076   1.6844795   -5.1626010  -1.2564188  -3.0584362  2.34260683  -1.695052   3.51939426  4.43626989  3.3296631   3.5169510   -2.92703345 -1.4131281  7.66182944  -1.40676753 -3.70130317 -0.9010226  -3.91265962 -0.85604657 0.6541337   -3.3668541  -9.7513509  -2.2203572  0.348708268 -0.51795228 -2.644891   -0.2826551  -3.5819070  -2.2470037  -3.829720   0.7522229   1.57592864  -1.15328558 2.9035609   6.7805296   3.2419771   5.607151    0.4836202   2.6242557   3.4674478   0.3317039   3.661060    -2.7323857  -3.85183300 -13.91937338    -29.9294984 -0.4273221  1.96726064  4.0437405   3.505792    12.0125181  3.7582406   3.7173530   11.0320698  0.2876495   1.7703799   -0.75943651 1.38642025  0.1694661   -0.09183614 -3.4427353  -3.42262435 4.56156149  1.27963086  2.3382191   4.471848294 -1.25201443
u[2,]   -0.4976848  -0.337874   -0.4690339  1.376631    -0.2000215  -0.07479611 -0.09590784 0.01132767  0.0859742   0.2965264   0.1478579   -0.1033833  -0.1089317  -0.3412644  -0.4387209  -0.295550   -0.04845632 0.01153943  0.1252204   0.2691985   0.006748248 0.03430976  0.1910270   0.4734956   0.1047363   0.2233812   0.1416566   -0.1008976  -0.3944692  -0.3639312  -0.1398038  -0.01805854 -0.144216   -0.03937892 -0.09407875 -0.2500851  -0.5786795  -0.05531415 -0.1411416  -0.03722923 -0.01203752 0.07773881  0.2560005   0.06210876  0.09857757  0.1958526   0.3224293   0.4340536   0.1937018   0.001760465 0.04619835  -0.089682   0.2107282   0.2445777   0.2605063   1.170965    0.1568308   -0.03179252 -0.06910847 0.1334419   -0.2214261  -0.2858338  -0.284578   -0.4268307  -0.2218157  -0.1590297  -0.1605659  -0.341321   -0.0825591  -0.09752851 0.08405546  0.3481321   0.3238180   -0.03529309 -0.1181096  -0.080112   -0.2421429  -0.1098334  -0.1149457  -0.0409451  -0.1428287  -0.0220812  -0.06036089 0.04192624  0.0760739   0.07842770  0.1800529   0.07780021  0.04023444  0.04337697  0.1267495   -0.002556303    0.0364775

optim() converges with your initial parameters, so I'm not sure there is a problem. But you can also try the alternative optimisation routines and run a simple test to see which parameters are giving warnings.

## Test results with other methods
x1 <- optim(c(17,5,3,4,27,13),LL,method="BFGS",u=u)
ps <- x1$par
x2 <- optim(ps, LL, method="Nelder-Mead", u=u)
x3 <- optim(ps, LL, method="SANN", u=u, control=list(maxit=1e5))
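When comparing runs from different methods or starting values, it may help to keep in mind that some differences in the parameters are expected: the likelihood depends on B only through B B' and B Omega B', and both are unchanged when a whole column of B changes sign. Two parameter vectors can therefore differ and still give the same objective value. A minimal sketch, using an assumed reconstruction of `LL` (second regime taken as columns 64 to 91, negative log-likelihood returned) and simulated data `u_sim`:

```r
# Assumed reconstruction of the objective; the original LL is not shown in full
LL <- function(theta, u) {
  B <- matrix(theta[1:4], 2, 2)
  L <- diag(theta[5:6])
  sig1 <- u[, 1:63]  %*% t(u[, 1:63])  / 63
  sig2 <- u[, 64:91] %*% t(u[, 64:91]) / 28   # assumed second regime
  S1 <- B %*% t(B)
  S2 <- B %*% L %*% t(B)
  (63/2) * (log(det(S1)) + sum(diag(sig1 %*% solve(S1)))) +
  (28/2) * (log(det(S2)) + sum(diag(sig2 %*% solve(S2))))
}

set.seed(42)
u_sim <- matrix(rnorm(2 * 91), nrow = 2)      # illustrative data only

theta_a <- c(1.2, 0.3, -0.4, 0.9, 2, 0.5)     # arbitrary parameter vector
theta_b <- theta_a
theta_b[1:2] <- -theta_b[1:2]                 # flip the sign of the first column of B

vals <- c(LL(theta_a, u_sim), LL(theta_b, u_sim))
vals                                          # equal up to rounding
```

Comparing `x1$value`, `x2$value` and `x3$value` rather than the raw parameters is one way to tell such equivalent solutions apart from genuinely different local optima.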

## Try some theta values (arbitrary)
tst <- expand.grid(-1:1, -1:1, -1:1, -1:1, -1:1, -1:1)
res <- apply(tst, 1, function(r) {
    tryCatch({
        LL(r, u)
    }, error=function(e) print(r))  # print the theta values that raise an error
})
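Note that the NaN from `log()` of a negative number is signalled as a warning rather than an error, so an `error=` handler alone will not flag it. A minimal sketch of catching both, using a hypothetical stand-in function `f` (only the log-determinant term) instead of the real `LL`:

```r
# Hypothetical stand-in for LL: log-determinant of B L B' for a 6-parameter theta
f <- function(theta) {
  B <- matrix(theta[1:4], 2, 2)
  L <- diag(theta[5:6])
  log(det(B %*% L %*% t(B)))
}

# Classify one parameter vector as "ok", "warning" or "error"
status <- function(theta) {
  tryCatch({
    f(theta)
    "ok"
  },
  warning = function(w) "warning",
  error   = function(e) "error")
}

status(c(1, 0, 0, 1, 1, 1))    # "ok"
status(c(1, 0, 0, 1, 1, -1))   # "warning": log of a negative determinant
```

Applying `status` over the rows of a grid such as `tst` would then show which sign patterns of theta trigger the warnings.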

There are warnings with negative/zero values of some thetas (I would need to look closer to see which ones). You could use "L-BFGS-B" to constrain your theta values to always be positive (except for theta[1], which is negative in the fitted parameters).

x4 <- optim(ps, LL, u=u, method="L-BFGS-B", lower=c(-Inf, 0, 0, 0, 0, 0),
            upper=rep(Inf, 6))  # no warnings
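An alternative to box constraints, not part of the answer above but a common workaround, is to reparameterise the diagonal of Omega as `exp(theta[5:6])`. The diagonal is then positive for any real theta, so an unconstrained method such as BFGS can be used without bounds. The sketch below assumes the same reconstruction of the objective as before (second regime taken as columns 64 to 91, negative log-likelihood returned) and runs on simulated data `u_sim`; `LL_log` is a hypothetical name.

```r
# Assumed reconstruction of the objective; theta[5:6] are now log-variances
LL_log <- function(theta, u) {
  B <- matrix(theta[1:4], 2, 2)
  L <- diag(exp(theta[5:6]))                  # exp() keeps the diagonal positive
  sig1 <- u[, 1:63]  %*% t(u[, 1:63])  / 63
  sig2 <- u[, 64:91] %*% t(u[, 64:91]) / 28   # assumed second regime
  S1 <- B %*% t(B)
  S2 <- B %*% L %*% t(B)
  (63/2) * (log(det(S1)) + sum(diag(sig1 %*% solve(S1)))) +
  (28/2) * (log(det(S2)) + sum(diag(sig2 %*% solve(S2))))
}

set.seed(1)
u_sim <- matrix(rnorm(2 * 91), nrow = 2)      # illustrative data only

fit <- optim(c(1, 0, 0, 1, 0, 0), LL_log, u = u_sim, method = "BFGS")
omega_hat <- exp(fit$par[5:6])                # back-transform to the diagonal of Omega
omega_hat
```

The back-transformed values in `omega_hat` play the role of theta[5] and theta[6] in the original parameterisation. This does not remove the need to check that B stays non-singular, since `solve()` still fails for a singular B %*% t(B).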

Percent differences between the estimates and the expected values:

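sig1 <- 1/63 * u[, 1:63] %*% t(u[, 1:63])  # break points 63 and 28 were chosen arbitrarily
B <- matrix(x1$par[1:4], 2, 2)
sig2hat <- B %*% matrix(c(x1$par[5], 0, 0, x1$par[6]), 2, 2) %*% t(B)
# Note: the likelihood uses B %*% t(B); this comparison uses B %*% B, as in the output below
data.frame(
    estimates=c(as.vector(B %*% B), diag(sig2hat)),
    expected=c(as.vector(sig1), diag(sig2)),
    percent=c(
        as.vector((B %*% B - sig1) / sig1 * 100),
        as.vector((sig2hat - sig2) / sig2 * 100)[c(2, 4)]))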
#     estimates    expected       percent
# 1 13.70655766 13.70602733  3.869312e-03
# 2 -0.15511643 -0.16220783 -4.371801e+00
# 3 -0.07984098 -0.16220783 -5.077859e+01
# 4  0.10278075  0.10379904 -9.810168e-01
# 5 55.29850597 55.29852730 -3.833837e-04
# 6  0.03042440  0.03042414  8.412705e-04

