Ensuring a positive definite covariance matrix

Question

The outputs of my neural network act as the entries of a covariance matrix. However, a one-to-one correspondence between outputs and entries results in covariance matrices that are not positive definite.

Thus, I read https://www.quora.com/When-carrying-out-the-EM-algorithm-how-do-I-ensure-that-the-covariance-matrix-is-positive-definite-at-all-times-avoiding-rounding-issues and https://en.wikipedia.org/wiki/Cholesky_decomposition , more specifically "When A has real entries, L has real entries as well and the factorization may be written A = LL^T ".

Now my outputs correspond to the entries of the L matrix, and I generate the covariance matrix by multiplying L by its transpose.

However, sometimes I still get an error about a matrix that is not positive definite. How is this possible?

I found a matrix that produces the error:

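import numpy as np

# L and Sigma come from the network (not shown here)
print(L.shape)
print(Sigma.shape)

S = Sigma[1,18,:,:] # The matrix that gives the error
L_ = L[1,18,:,:]
print(L_)
S = np.dot(L_,np.transpose(L_)) # Rebuild the covariance from its factor
print(S)
chol = np.linalg.cholesky(S)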

This gives as output:

(3, 20, 2, 2)
(3, 20, 2, 2)
[[ -1.69684255e+00   0.00000000e+00]
 [ -1.50235415e+00   1.73807144e-04]]
[[ 2.87927461  2.54925847]
 [ 2.54925847  2.25706792]]
.....
LinAlgError: Matrix is not positive definite

However, the following code, which uses the copied values, works fine (though the values are probably not exactly the same, because not all decimals are printed):

B = np.array([[-1.69684255e+00, 0.00000000e+00], [-1.50235415e+00, 1.73807144e-04]])
A = np.dot(B,B.T)
chol_A = np.linalg.cholesky(A)

So my questions are:

Is the method of using Sigma = LL' correct (with ' denoting the transpose)?
If yes, why am I getting an error? Could this be due to rounding issues?

Edit: I also computed the eigenvalues:

print(np.linalg.eigvalsh(S))
[ -7.89378944432428397703915834426880e-08
   5.13634252548217773437500000000000e+00]

And for the second case:

print(np.linalg.eigvalsh(A))
[  1.69341869415973178547574207186699e-08
   5.13634263409323210680668125860393e+00]

So the first case has a slightly negative eigenvalue, which explains why the matrix is not positive definite. But how can I solve this?

Answer 1

This looks like a numerical issue. In general, however, it is not true that LL' will always be positive definite: it is always positive semidefinite, but it is positive definite if and only if L is invertible. For example, take L to be a matrix in which every column is [1 0 0 0 ... 0] (or, more extreme, a zero matrix of arbitrary size); then LL' won't be PD. In general I would recommend using

S = LL' + eps I

which takes care of both problems (for small eps) and is a 'regularized' covariance estimate. You can even go for an "optimal" (under some assumptions) value of eps by using the Ledoit-Wolf estimator.
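
A minimal NumPy sketch of the regularized estimate S = LL' + eps I. The function name `regularized_covariance` and the default `eps=1e-6` are assumptions made for illustration, not part of the answer; a suitable eps depends on the scale and dtype of your data.

```python
import numpy as np


def regularized_covariance(L, eps=1e-6):
    """Return L L' + eps * I for one matrix or a batch of matrices.

    The name and the default eps are illustrative assumptions.
    """
    L = np.asarray(L, dtype=np.float64)
    n = L.shape[-2]  # L L' is n x n, where n is the number of rows of L
    # Swapping the last two axes also handles batches, e.g. shape (3, 20, 2, 2).
    return L @ np.swapaxes(L, -1, -2) + eps * np.eye(n)


# A singular L: every column is [1, 0], so L L' has a zero eigenvalue.
L_singular = np.array([[1.0, 1.0],
                       [0.0, 0.0]])

try:
    np.linalg.cholesky(L_singular @ L_singular.T)
    plain_ok = True
except np.linalg.LinAlgError:
    plain_ok = False

S_reg = regularized_covariance(L_singular)
chol_reg = np.linalg.cholesky(S_reg)  # eigenvalues are shifted up by eps

print("plain L L' factorizes:", plain_ok)
print("regularized eigenvalues:", np.linalg.eigvalsh(S_reg))
```

Adding eps I shifts every eigenvalue of LL' up by eps, so a larger eps gives more numerical headroom at the cost of a more biased estimate.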

Answer 2

I suspect that the computation of L*L' is being done with floats in the first case and with doubles in the second. I tried taking your L as a float matrix, computing L*L' and finding its eigenvalues, and I got the same values as you do in the first case. If I convert L to a matrix of doubles, compute L*L' and find the eigenvalues, I get the same values as you do in the second case.

This makes sense: in the computation of the [1,1] entry of L*L', the square of 1.73807144e-04 will, in floats, be negligible compared to the square of -1.50235415e+00.

If I'm right, the solution is to convert L to a matrix of doubles before any computation.
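
The absorption described above can be checked with scalar arithmetic alone. This is a small sketch using the two values printed in the question; the variable names are made up for illustration.

```python
import numpy as np

# Second row of the L_ printed in the question.
b, c = -1.50235415e+00, 1.73807144e-04

# Single precision: c**2 (about 3e-8) is smaller than half the float32
# spacing near b**2 (about 2.26), so adding it leaves b**2 unchanged.
b32, c32 = np.float32(b), np.float32(c)
absorbed_in_float32 = bool(b32 * b32 + c32 * c32 == b32 * b32)

# Double precision: the spacing near 2.26 is about 4.4e-16, so c**2 survives.
b64, c64 = np.float64(b), np.float64(c)
absorbed_in_float64 = bool(b64 * b64 + c64 * c64 == b64 * b64)

print(absorbed_in_float32, absorbed_in_float64)
```

Once the small term is lost, the computed matrix is exactly singular up to rounding, and rounding in the other entries can push the smallest eigenvalue slightly below zero.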
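
A sketch of that conversion, assuming the network emits float32 values (an assumption about the asker's setup). The matrix is copied from the printed output in the question, so the last digits may differ from the original.

```python
import numpy as np

# Values copied from the L_ printed in the question, stored in single
# precision to mimic float32 network outputs (an assumption).
L32 = np.array([[-1.69684255e+00, 0.00000000e+00],
                [-1.50235415e+00, 1.73807144e-04]], dtype=np.float32)

# Convert first, then multiply, so the product is computed in float64.
L64 = L32.astype(np.float64)
S64 = L64 @ L64.T

chol64 = np.linalg.cholesky(S64)
print(S64.dtype, np.linalg.eigvalsh(S64))
```

The order matters: casting S to float64 after a float32 multiplication would not help, because the small term has already been rounded away.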

Answer 1
5 2016-11-13 15:12:44

Answer 2
1 2016-11-15 17:25:29
