简体   繁体   English

同时使用numpy对齐矩阵

[英]Simultaneously diagonalize matrices with numpy

I have a m × n × n numpy.ndarray of m simultaneously diagonalizable square matrices and would like to use numpy to obtain their simultaneous eigenvalues. 我有一个m × n × n numpy.ndarraym个可同时对角化的方形矩阵,并希望使用numpy来获得它们的同时特征值。

For example, if I had 例如,如果我有

from numpy import einsum, diag, array, linalg, random
U = linalg.svd(random.random((3,3)))[2]

M = einsum(
    "ij, ajk, lk",
    U, [diag([2,2,0]), diag([1,-1,1])], U)

the two matrices in M are simultaneously diagonalizable, and I am looking for a way to obtain the array M中的两个矩阵同时可以对角化,我正在寻找一种获取阵列的方法

array([[2.,  1.],
       [2., -1.],
       [0.,  1.]])

(up to permutation of the lines) from M . M (直到线的排列)。 Is there a built-in or easy way to get this? 是否有内置或简单的方法来获得这个?

There is a fairly simple and very elegant simultaneous diagonalization algorithm based on Givens rotation that was published by Cardoso and Soulomiac in 1996: 基于Givens旋转的相当简单且非常优雅的同时对角化算法由Cardoso和Soulomiac于1996年出版:

Cardoso, J., & Souloumiac, A. (1996). Cardoso,J。,&Souloumiac,A。(1996)。 Jacobi Angles for Simultaneous Diagonalization. Jacobi Angles用于同时对角化。 SIAM Journal on Matrix Analysis and Applications, 17(1), 161–164. SIAM Journal on Matrix Analysis and Applications,17(1),161-164。 doi:10.1137/S0895479893259546 DOI:10.1137 / S0895479893259546

I've attached a numpy implementation of the algorithm at the end of this response. 我在这个响应结束时附加了算法的numpy实现。 Caveat: It turns out simultaneous diagonalization is a bit of a tricky numerical problem, with no algorithm (to the best of my knowledge) that guarantees global convergence. 警告:事实证明,同时对角化是一个棘手的数值问题,没有算法(据我所知)保证全局收敛。 However, the cases in which it does not work (see the paper) are degenerate and in practice I have never had the Jacobi angles algorithm fail on me. 然而,它不起作用的情况(参见论文)是退化的,在实践中我从来没有让Jacobi角度算法失败。

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
"""
Routines for simultaneous diagonalization
Arun Chaganty <arunchaganty@gmail.com>
"""

import numpy as np
from numpy import zeros, eye, diag
from numpy.linalg import norm

def givens_rotate( A, i, j, c, s ):
    """
    Rotate A along axis (i,j) by c and s
    """
    Ai, Aj = A[i,:], A[j,:]
    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai 

    return A

def givens_double_rotate( A, i, j, c, s ):
    """
    Rotate A along axis (i,j) by c and s
    """
    Ai, Aj = A[i,:], A[j,:]
    A[i,:], A[j,:] = c * Ai + s * Aj, c * Aj - s * Ai 
    A_i, A_j = A[:,i], A[:,j]
    A[:,i], A[:,j] = c * A_i + s * A_j, c * A_j - s * A_i 

    return A

def jacobi_angles( *Ms, **kwargs ):
    r"""
    Simultaneously diagonalize using Jacobi angles
    @article{SC-siam,
       HTML =   "ftp://sig.enst.fr/pub/jfc/Papers/siam_note.ps.gz",
       author = "Jean-Fran\c{c}ois Cardoso and Antoine Souloumiac",
       journal = "{SIAM} J. Mat. Anal. Appl.",
       title = "Jacobi angles for simultaneous diagonalization",
       pages = "161--164",
       volume = "17",
       number = "1",
       month = jan,
       year = {1995}}

    (a) Compute Givens rotations for every pair of indices (i,j) i < j 
            - from eigenvectors of G = gg'; g = A_ij - A_ji, A_ij + A_ji
            - Compute c, s as \sqrt{x+r/2r}, y/\sqrt{2r(x+r)}
    (b) Update matrices by multiplying by the givens rotation R(i,j,c,s)
    (c) Repeat (a) until stopping criterion: sin theta < threshold for all ij pairs
    """

    assert len(Ms) > 0
    m, n = Ms[0].shape
    assert m == n

    sweeps = kwargs.get('sweeps', 500)
    threshold = kwargs.get('eps', 1e-8)
    rank = kwargs.get('rank', m)

    R = eye(m)

    for _ in xrange(sweeps):
        done = True
        for i in xrange(rank):
            for j in xrange(i+1, m):
                G = zeros((2,2))
                for M in Ms:
                    g = np.array([ M[i,i] - M[j,j], M[i,j] + M[j,i] ])
                    G += np.outer(g,g) / len(Ms)
                # Compute the eigenvector directly
                t_on, t_off = G[0,0] - G[1,1], G[0,1] + G[1,0]
                theta = 0.5 * np.arctan2( t_off, t_on + np.sqrt( t_on*t_on + t_off * t_off) )
                c, s = np.cos(theta), np.sin(theta)

                if abs(s) > threshold:
                    done = False
                    # Update the matrices and V
                    for M in Ms:
                        givens_double_rotate(M, i, j, c, s)
                        #assert M[i,i] > M[j, j]
                    R = givens_rotate(R, i, j, c, s)

        if done:
            break
    R = R.T

    L = np.zeros((m, len(Ms)))
    err = 0
    for i, M in enumerate(Ms):
        # The off-diagonal elements of M should be 0
        L[:,i] = diag(M)
        err += norm(M - diag(diag(M)))

    return R, L, err

I am not aware of any direct solution. 我不知道有任何直接的解决方案。 But why not just getting the eigenvalues and the eigenvectors of the first matrix, and using the eigenvectors to transform all other matrices to the diagonal form? 但为什么不只是得到第一个矩阵的特征值和特征向量,并使用特征向量将所有其他矩阵转换为对角线形式? Something like: 就像是:

eigvals, eigvecs = np.linalg.eig(matrix1)
eigvals2 = np.diagonal(np.dot(np.dot(transpose(eigvecs), matrix2), eigvecs))

You can the add the columns to an array via hstack if you like. 如果您愿意,可以通过hstack将列添加到数组中。

UPDATE: As pointed out below, this is only valid if no degenerate eigenvalues occur. 更新:如下所述,这仅在没有退化特征值的情况下有效。 Otherwise one would have to check first for the degenerate eigenvalues, then transform the 2nd matrix to a blockdiagonal form, and diagonalize eventual blocks bigger than 1x1 separately. 否则人们必须首先检查简并本征值,则该第二矩阵变换到blockdiagonal形式,并分别对角化最终块的1x1比更大。

If you know something about the size of the eigenvalues of the two matrices in advance, you can diagonalize a linear combination of the two matrices, with coefficients chosen to break the degeneracy. 如果您事先知道两个矩阵的特征值的大小,则可以对两个矩阵的线性组合进行对角化,并选择系数来打破简并性。 For example, if the eigenvalues of both lie between -10 and 10, you could diagonalize 100*M1 + M2. 例如,如果两者的特征值介于-10和10之间,则可以对角化100 * M1 + M2。 There's a slight loss of precision, but for many purposes it's good enough--and quick and easy! 精确度略有下降,但出于很多目的,它已经足够好了 - 而且快速而简单!

I am sure there is significant room for improvement in my solution, but I have come up with the following set of three functions doing the calculation for me in a semi-robust way. 我确信我的解决方案还有很大的改进空间,但是我已经提出了以下三个函数集,以半稳健的方式为我计算。

def clusters(array,
             orig_indices = None,
             start = 0,
             rtol=numpy.allclose.__defaults__[0],
             atol=numpy.allclose.__defaults__[1]):
    """For an array, return a permutation that sorts the numbers and the sizes of the resulting blocks of identical numbers."""
    array = numpy.asarray(array)
    if not len(array):
        return numpy.array([]),[]
    if orig_indices is None:
        orig_indices = numpy.arange(len(array))
    x = array[0]
    close = abs(array-x) <= (atol + rtol*abs(x))
    first = sum(close)
    r_perm, r_sizes = clusters(
        array[~close],
        orig_indices[~close],
        start+first,
        rtol, atol)
    r_sizes.insert(0, first)
    return numpy.concatenate((orig_indices[close], r_perm)), r_sizes

def permutation_matrix(permutation, dtype=dtype):
    n = len(permutation)
    P = numpy.zeros((n,n), dtype)
    for i,j in enumerate(permutation):
        P[j,i]=1
    return P

def simultaneously_diagonalize(tensor, atol=numpy.allclose.__defaults__[1]):
    tensor = numpy.asarray(tensor)
    old_shape = tensor.shape
    size = old_shape[-1]
    tensor = tensor.reshape((-1, size, size))
    diag_mask = 1-numpy.eye(size)

    eigvalues, diagonalizer = numpy.linalg.eig(tensor[0])
    diagonalization = numpy.dot(
        numpy.dot(
            matrix.linalg.inv(diagonalizer),
            tensor).swapaxes(0,-2),
        diagonalizer)
    if numpy.allclose(diag_mask*diagonalization, 0):
        return diagonalization.diagonal(axis1=-2, axis2=-1).reshape(old_shape[:-1])
    else:
        perm, cluster_sizes = clusters(diagonalization[0].diagonal())
        perm_matrix = permutation_matrix(perm)
        diagonalization = numpy.dot(
            numpy.dot(
                perm_matrix.T,
                diagonalization).swapaxes(0,-2),
            perm_matrix)
        mask = 1-scipy.linalg.block_diag(
            *list(
                numpy.ones((blocksize, blocksize))
                for blocksize in cluster_sizes))
        print(diagonalization)
        assert(numpy.allclose(
                diagonalization*mask,
                0)) # Assert that the matrices are co-diagonalizable
        blocks = numpy.cumsum(cluster_sizes)
        start = 0
        other_part = []
        for block in blocks:
            other_part.append(
                simultaneously_diagonalize(
                    diagonalization[1:, start:block, start:block]))
            start = block
        return numpy.vstack(
            (diagonalization[0].diagonal(axis1=-2, axis2=-1),
             numpy.hstack(other_part)))

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM