
convert a sparse matrix to dense and get the full eigenvalues

Recently I've been working on a problem that requires diagonalizing a huge Hermitian matrix to get all of its eigenvalues. Currently I'm using Mathematica for the job.

However, this stops being feasible due to memory limitations once the matrix size approaches (2^15, 2^15): a dense 2^15 x 2^15 complex matrix alone takes about 16 GB at 16 bytes per entry, and the diagonalization then costs approximately 32 GB of memory.

I've tried Python, importing the matrix exported from Mathematica:

import numpy as np
from scipy.io import mmread
from scipy.sparse import csc_matrix

# import as a sparse matrix to save space
h = mmread("h.mtx")
h = csc_matrix(h)
# diagonalize the dense version
ev = np.linalg.eigvalsh(h.toarray())

It works, but is unfortunately an order of magnitude slower than Mathematica.

So, are there any other possible solutions, say, C++?

I know nothing about C++, so I'm guessing the simplest route would be importing the matrix into C++ and diagonalizing it there.

Thanks!

Running some preliminary tests using this matrix:

http://math.nist.gov/MatrixMarket/data/NEP/h2plus/qc2534.html

I determined that the conversion to dense does not take up much of the time. The eigenvalue calculation does.

NumPy uses highly optimized LAPACK routines for the calculation. These are the same routines you'd call from C++, so C++ won't give you much of a speedup. If you want a speedup, exploit the sparsity, move to a machine with more memory, or switch to distributed matrix storage (lots of labor there).
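For example, if a partial spectrum would do, SciPy can work on the sparse matrix directly and never form the dense one. A minimal sketch using scipy.sparse.linalg.eigsh (SciPy's ARPACK wrapper), assuming the h.mtx file from the question and that the 100 smallest eigenvalues suffice (k is illustrative):

from scipy.io import mmread
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import eigsh

# the matrix stays sparse for the whole computation
h = csc_matrix(mmread("h.mtx"))

# 'SA' = smallest algebraic eigenvalues; only k << N Lanczos
# vectors are kept in memory, never a dense N x N matrix
ev = eigsh(h, k=100, which='SA', return_eigenvectors=False)

Note this only helps if you can live without the full spectrum, which is exactly the restriction discussed in the edit below.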

PS: if you're doing this for a university project, check whether your university has a cluster of some sort; cluster nodes typically have lots of memory. If not, look at Amazon's AWS EC2 or Google's Compute Engine for instances with lots of RAM.

Edit:

Here Wolfram says what Mathematica does behind the scenes: http://reference.wolfram.com/language/tutorial/LinearAlgebraAppendix.html#83486633

ARPACK is an (Arnoldi) subspace solver that gives you only the k largest or smallest eigenvalues, ATLAS is just a LAPACK implementation, and the rest seem to be for solving linear systems.

All methods that give you the full eigenspectrum require decomposing the full N×N matrix. If you only want k eigenvectors, there are methods that reduce this to the decomposition of a k×k matrix.

There are modern alternatives to ARPACK (http://slepc.upv.es/ or the solver that comes with MKL), but they all give you a subspace.
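To illustrate what such a subspace solver buys you: SciPy's ARPACK wrapper also supports shift-invert mode, which targets the eigenvalues closest to a chosen shift sigma rather than the extremes of the spectrum. A hedged sketch (k and sigma are illustrative, file name as in the question):

from scipy.io import mmread
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import eigsh

h = csc_matrix(mmread("h.mtx"))

# shift-invert: factorize (h - sigma*I) once, then the Lanczos
# iteration converges to the eigenvalues nearest sigma; with sigma
# set, which='LM' means "closest to sigma" in the transformed problem
ev = eigsh(h, k=50, sigma=0.0, which='LM', return_eigenvectors=False)

Either way you get k eigenvalues out of N, never the full spectrum.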

C++ won't help much. In Python you can delegate easily to C++, and a lot of SciPy routines do just that (for performance). I also expect that if you time only the eigenvalue line you will get performance similar to Mathematica's, and that the difference comes from reading the data.
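A quick way to check is to time the two stages separately. A sketch, assuming the same h.mtx file as in the question:

import time
import numpy as np
from scipy.io import mmread
from scipy.sparse import csc_matrix

t0 = time.perf_counter()
h = csc_matrix(mmread("h.mtx")).toarray()  # read + densify
t1 = time.perf_counter()
ev = np.linalg.eigvalsh(h)                 # decomposition only
t2 = time.perf_counter()

print(f"read + densify: {t1 - t0:.2f} s")
print(f"eigvalsh:       {t2 - t1:.2f} s")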

The best solution is to look for a more appropriate algorithm: perhaps something that operates on the sparse matrix directly, or that decomposes the original into smaller matrices and combines the results.

To make the original approach more tractable you could try increasing the amount of swap space. On Linux it's a dedicated partition; on Windows it's a setting. This should allow Mathematica/Python to use more memory, but it's going to be much slower due to memory thrashing. An SSD will speed this setup up, but note that it will wear out faster due to the frequent writes. Or, even better, buy more RAM.
