Diagonalizing large sparse matrix with Python/Scipy
I am working with a large (complex) Hermitian matrix and I am trying to diagonalize it efficiently using Python/Scipy. Using the eigh function from scipy.linalg, it takes about 3s to generate and diagonalize a roughly 800x800 matrix and compute all the eigenvalues and eigenvectors.
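For reference, the dense approach looks roughly like this (the matrix below is a random Hermitian stand-in for the actual problem matrix, which is an assumption here):

```python
import numpy as np
from scipy.linalg import eigh

n = 800
rng = np.random.default_rng(0)
# Random complex Hermitian stand-in for the actual problem matrix
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
h = (a + a.conj().T) / 2  # Hermitian by construction: h == h.conj().T

# eigh computes all eigenvalues (in ascending order) and eigenvectors
w, v = eigh(h)
```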
The eigenvalues in my problem are symmetrically distributed around 0 and range from roughly -4 to 4. I only need the eigenvectors corresponding to the negative eigenvalues, though, which turns the range I am looking to calculate into [-4, 0).
My matrix is sparse, so it's natural to use the scipy.sparse package and its functions to calculate the eigenvectors via eigsh, since it uses much less memory to store the matrix.
Also, I can tell the program to only calculate the negative eigenvalues via which='SA'. The problem with this method is that it now takes roughly 40s to compute half the eigenvalues/eigenvectors. I know that the ARPACK algorithm is very inefficient when computing small eigenvalues, but I can't think of any other way to compute all the eigenvectors that I need.
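A sketch of this sparse approach (using a smaller real-symmetric random stand-in matrix instead of the actual 800x800 complex Hermitian one, just to keep it quick):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 200
rng = np.random.default_rng(0)
a = sp.random(n, n, density=0.05, random_state=rng)
h = (a + a.T) / 2  # symmetric sparse stand-in for the real Hermitian matrix

# which='SA' asks ARPACK for the k algebraically smallest eigenpairs
k = n // 2
w, v = eigsh(h, k=k, which='SA')
```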
Is there any way to speed up the calculation? Maybe by using the shift-invert mode? I will have to do many, many diagonalizations and eventually increase the size of the matrix as well, so I am a bit lost at the moment.
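A shift-invert attempt might look like the sketch below; the sigma value is only an illustrative guess at a point inside the negative part of the spectrum, and the sparse matrix is again a random stand-in:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 200
rng = np.random.default_rng(1)
a = sp.random(n, n, density=0.05, random_state=rng)
h = ((a + a.T) / 2).tocsc()  # CSC format suits the internal factorization

# In shift-invert mode ARPACK factorizes (h - sigma*I) and converges fastest
# for eigenvalues nearest sigma; sigma=-0.5 is just an illustrative guess
w, v = eigsh(h, k=10, sigma=-0.5, which='LM')
```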
I would really appreciate any help!
This question is probably better to ask on http://scicomp.stackexchange.com as it's more of a general math question, rather than specific to Scipy or related to programming.
If you need all eigenvectors, it does not make very much sense to use ARPACK. Since you need N/2 eigenvectors, your memory requirement is at least N*N/2 floats, and probably more in practice. Using eigh requires N*N + 3*N floats. eigh is then within a factor of 2 of the minimum requirement, so the easiest solution is to stick with it.
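As an aside, if SciPy >= 1.5 is available, the dense eigh can itself be restricted to a value range via its subset_by_value parameter, which matches the [-4, 0) requirement directly without ARPACK (a sketch on a random Hermitian stand-in matrix):

```python
import numpy as np
from scipy.linalg import eigh

n = 400
rng = np.random.default_rng(2)
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
h = (a + a.conj().T) / 2  # random Hermitian stand-in

# subset_by_value=[lo, hi] returns only eigenpairs with lo < w <= hi,
# i.e. here just the negative part of the spectrum (SciPy >= 1.5)
w, v = eigh(h, subset_by_value=[-np.inf, 0])
```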
If you can process the eigenvectors "on-line", so that you can throw the previous one away before processing the next, there are other approaches; look at the answers to similar questions on scicomp.