
Precision of matrix calculation R vs. Stata

I am trying to determine whether matrices are negative semi-definite or not. To do so, I check whether all eigenvalues are less than or equal to zero. One example matrix is:

              [,1]          [,2]          [,3]          [,4]
[1,] -1.181830e-05  0.0001576663 -2.602332e-07  1.472770e-05
[2,]  1.576663e-04 -0.0116220027  3.249607e-04 -2.348050e-04
[3,] -2.602332e-07  0.0003249607 -2.616447e-05  3.492998e-05
[4,]  1.472770e-05 -0.0002348050  3.492998e-05 -9.103073e-05

The eigenvalues calculated by Stata are 1.045e-12, -0.00001559, -0.00009737, -0.01163805. However, the eigenvalues calculated by R are -1.207746e-20, -1.558760e-05, -9.737074e-05, -1.163806e-02. So the last three eigenvalues are very similar, but the first one, which is very close to zero, is not. With the eigenvalues obtained from Stata, the matrix is not negative semi-definite, but with the eigenvalues obtained from R it is. Is there a way I can find out which calculation is more precise? Or might it even be possible to rescale the matrix in order to avoid extremely small eigenvalues?

Thank you very much in advance. Every hint will be highly appreciated.

You can't expect so much precision from a numerical algorithm using double precision floating point numbers.

Doubles carry about 15 to 17 significant decimal digits, and that precision is relative to the scale of the matrix, so loss of relative precision near zero is not uncommon. That is, given numerical error, 1e-12 and -1e-20 are both essentially indistinguishable from 0.
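To put a rough number on that, here is a minimal sketch assuming the usual rule of thumb that a symmetric eigensolver in double precision returns eigenvalues with an absolute error on the order of machine epsilon times the largest eigenvalue magnitude (up to a modest factor that depends on the matrix size). The variable names are illustrative; the numbers are the ones quoted in this thread.

```python
import sys

# Machine epsilon for IEEE 754 double precision (2**-52).
eps = sys.float_info.epsilon

# Largest eigenvalue magnitude reported in the question.
largest_magnitude = 1.163806e-02

# Rough absolute noise floor for any computed eigenvalue of this matrix.
# This is a rule of thumb, not a guaranteed bound.
noise_floor = eps * largest_magnitude

# Smallest eigenvalues reported by the different packages in this answer.
reported = {
    "R 3.4.1": 5.929231e-21,
    "MATLAB R2017a": 3.412972022812169e-19,
    "Stata 15 (matrix eigenvalues)": 3.2998e-20,
    "Stata 15 (matrix symeigen)": 4.464e-19,
    "Intel Fortran with MKL (DSYEV)": 2.2608e-19,
}

print(f"noise floor ~ {noise_floor:.3e}")
for name, value in reported.items():
    print(f"{name}: {value:.3e}  below floor: {abs(value) < noise_floor}")

# True when every reported value is numerically indistinguishable from zero.
all_below_floor = all(abs(v) < noise_floor for v in reported.values())
```

Under that assumption, all five values fall below the noise floor, so none of them says anything about the sign of the true eigenvalue.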

For instance, for the smallest eigenvalue (using coefficients you give in your comment), I get:

  • R 3.4.1: 5.929231e-21
  • MATLAB R2017a: 3.412972022812169e-19
  • Stata 15: 3.2998e-20 (matrix eigenvalues) or 4.464e-19 (matrix symeigen)
  • Intel Fortran with MKL (DSYEV function): 2.2608e-19

You may choose a threshold, say 1e-10, and force an eigenvalue to zero when the ratio of its absolute value to the largest absolute eigenvalue is less than that threshold.
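The thresholding idea can be sketched as follows. This is only an illustration: the function names and the default `rel_tol=1e-10` are assumptions for the example, and the eigenvalues are copied from the question rather than recomputed.

```python
def clamp_small_eigenvalues(eigenvalues, rel_tol=1e-10):
    """Replace eigenvalues that are negligible relative to the largest by 0.0."""
    scale = max(abs(v) for v in eigenvalues)
    if scale == 0.0:
        return [0.0 for _ in eigenvalues]
    return [0.0 if abs(v) < rel_tol * scale else v for v in eigenvalues]


def is_negative_semidefinite(eigenvalues, rel_tol=1e-10):
    """True when no eigenvalue is positive after clamping the negligible ones."""
    return all(v <= 0.0 for v in clamp_small_eigenvalues(eigenvalues, rel_tol))


# Eigenvalues as reported in the question.
stata_values = [1.045e-12, -0.00001559, -0.00009737, -0.01163805]
r_values = [-1.207746e-20, -1.558760e-05, -9.737074e-05, -1.163806e-02]

print(is_negative_semidefinite(stata_values))  # True
print(is_negative_semidefinite(r_values))      # True
```

With this tolerance both sets of eigenvalues lead to the same verdict. The choice of tolerance remains a judgment call: a smallest eigenvalue of 3.696e-12 with the same largest eigenvalue would not be clamped at 1e-10.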

Anyway, your 1e-12 looks a bit large. You may have lost some precision when transferring data between Stata and R: a small relative error in the matrix can result in a large relative error for eigenvalues around zero. With Stata and the data in your question (not in the comment), I get for instance 3.696e-12 for the smallest eigenvalue.
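A small worked example of that sensitivity, using a hypothetical 2x2 symmetric matrix (not the matrix from the question) whose eigenvalues have a closed form. The helper `symmetric_2x2_eigenvalues` is written for this illustration only.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], returned as (smaller magnitude, larger magnitude)."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    # Pick the root in which both terms have the same sign, to avoid cancellation.
    big = mean + radius if mean >= 0.0 else mean - radius
    if big == 0.0:
        return (0.0, 0.0)
    # The product of the two eigenvalues equals the determinant.
    small = (a * d - b * b) / big
    return (small, big)


# [[1, 1], [1, 1]] is singular: its eigenvalues are exactly 0 and 2.
exact_small, exact_big = symmetric_2x2_eigenvalues(1.0, 1.0, 1.0)

# Perturb a single entry by a relative error of 1e-9.
pert_small, pert_big = symmetric_2x2_eigenvalues(1.0 * (1.0 + 1e-9), 1.0, 1.0)

print(exact_small, exact_big)
print(pert_small, pert_big)
```

The large eigenvalue moves by a relative amount comparable to the perturbation, while the zero eigenvalue jumps to roughly 5e-10: its sign and magnitude are determined entirely by the perturbation.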

However, even with the same matrix, there may still be differences (there are, above), due to variations in:

  • the parser, if you enter your numbers as text
  • the algorithm used for eigenvalue computation
  • implementation details of the same algorithm (floating-point addition and multiplication are not associative, for instance)
  • the compiler used to compile the computation routines, or compiler options
  • floating-point hardware
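The non-associativity mentioned in the list is easy to observe directly. A minimal sketch in double precision:

```python
# The same three numbers, added in two different groupings.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

An eigenvalue routine performs a very large number of such operations, so two implementations that order them differently can legitimately disagree in the last digits.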

The traditional suggested reading for this kind of question:

What Every Computer Scientist Should Know About Floating-Point Arithmetic
