
How to find the pseudo-inverse of a large sparse matrix?

I have to invert a large sparse matrix (50000 x 12000). It was initially stored as numpy.ndarray and the size of the matrix was around 3.5 GB. I have tried inverting this matrix using numpy.linalg.pinv but it crashes the jupyter notebook kernel. Converting this numpy.ndarray to scipy.sparse.csr_matrix (sparse matrix format) works, but I am unaware of any function that can calculate the pseudo-inverse of csr_matrix.

How do I find the pseudo-inverse of a large sparse matrix?

The inverse or pseudoinverse of a sparse matrix is not necessarily sparse, so computing pinv would require storing a dense matrix of similar size anyway, plus several intermediate results. Do you really need the pseudoinverse explicitly?

You can also solve such systems with e.g. numpy.linalg.lstsq or scipy.linalg.lstsq, which do not need to form the (pseudo)inverse explicitly. This is far cheaper in memory and computation, and it is also numerically more stable.
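As a minimal sketch of that point, the snippet below solves a least-squares problem with numpy.linalg.lstsq and checks it against the explicit pseudoinverse. The matrix here is a small random stand-in (50 x 12), not the 50000 x 12000 matrix from the question, and the names A, b and x are chosen only for illustration.

```python
import numpy as np

# Small dense stand-in for the real problem (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 12))
b = rng.standard_normal(50)

# Least-squares solution of A @ x ~= b without ever forming pinv(A).
x, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

# For comparison only: the same solution via the explicit pseudoinverse.
x_via_pinv = np.linalg.pinv(A) @ b

print(np.allclose(x, x_via_pinv))
```

Both routes give the same x here; the difference is that lstsq never materializes the 12 x 50 pseudoinverse, which is what becomes prohibitive at the sizes in the question.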

Finally, you could also use the functions above to compute the pseudoinverse column by column, by minimizing the 2-norm

|| A*x - e_j ||

for each unit vector e_j = (0, 0, ..., 0, 1, 0, ..., 0), with the 1 in position j, and saving each solution x separately as column j of the pseudoinverse.
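The column-by-column idea could be sketched as follows. Note one substitution: instead of the dense lstsq functions named above, this sketch uses scipy.sparse.linalg.lsqr, an iterative least-squares solver that accepts a csr_matrix directly. The helper name pinv_columns, the small test matrix and the tolerances are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr


def pinv_columns(A, **lsqr_kwargs):
    """Yield (j, x_j), where x_j minimizes ||A @ x - e_j||_2.

    x_j is column j of the pseudoinverse of A. Yielding one column at a
    time lets the caller store each column separately (e.g. on disk).
    Hypothetical helper, not a library function.
    """
    m, _ = A.shape
    for j in range(m):
        e_j = np.zeros(m)
        e_j[j] = 1.0
        # lsqr returns a tuple; the first entry is the solution vector.
        x_j = lsqr(A, e_j, **lsqr_kwargs)[0]
        yield j, x_j


# Small sparse stand-in (60 x 15) for the 50000 x 12000 matrix in the question.
rng = np.random.default_rng(0)
dense = rng.standard_normal((60, 15))
dense[np.abs(dense) < 1.0] = 0.0          # zero out most entries
A = sparse.csr_matrix(dense)

# Assembled into one array here only so the result can be checked;
# at full scale each column would be written out separately instead.
m, n = A.shape
P = np.empty((n, m))
for j, col in pinv_columns(A, atol=1e-12, btol=1e-12):
    P[:, j] = col

print(np.allclose(P, np.linalg.pinv(dense), atol=1e-6))
```

For the matrix in the question this means 50000 separate solves, each producing a vector of length 12000, so it is only worthwhile if the explicit pseudoinverse is really needed.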

