I have to calculate some products in the form A'A
or more general A'DA
, where A
is a general mxn
matrix and D
is a diagonal mxm
matrix. Both of them are full rank; ie rank(A)=min(m,n)
.
I know that you can save a substantial time is such symmetric products: given that A'A
is symmetric, you only have to calculate the lower --or upper-- diagonal part of the product matrix. That adds to n(n+1)/2
entries to be calculated, which is roughly the half of the typical n^2
for large matrices.
This is a great saving that I want to exploit, and I know I can implement the matrix-matrix multiply within a for
loop . However, so far I have been using BLAS, which is much faster than any for
loop implementation that I could write by myself, since it optimizes cache and memory management.
Is there a way to efficiently compute A'A
or even A'DA
using BLAS? Thanks!
You are looking for dsyrk
subroutine of BLAS.
As described in the documentation:
SUBROUTINE dsyrk(UPLO,TRANS,N,K,ALPHA,A,LDA,BETA,C,LDC)
DSYRK performs one of the symmetric rank k operations
C := alpha*A*A**T + beta*C
,or
C := alpha*A**T*A + beta*C
,where alpha and beta are scalars, C is an n by n symmetric matrix and A is an n by k matrix in the first case and ak by n matrix in the second case.
In the case of A'A
storing upper triangular is:
CALL dsyrk( 'U' , 'T' , N , M , 1.0 , A , M , 0.0 , C , N )
For the A'DA
there is no direct equivalent in BLAS. However you can use dsyr
in a for loop.
SUBROUTINE dsyr(UPLO,N,ALPHA,X,INCX,A,LDA)
DSYR performs the symmetric rank 1 operation
A := alpha*x*x**T + A
,where alpha is a real scalar, x is an n element vector and A is an n by n symmetric matrix.
do i = 1, M
call dsyr('U',N,D(i,i),A(1,i),M,C,N)
end do
The dsyrk
routine in BLAS suggested by @ztik is the one for A'A
. For A'DA
, one possibility is to use the dsyr2k
routine which can perform the symmetric rank 2k operations:
C := alpha*A**T*B + alpha*B**T*A + beta*C.
Set alpha = 0.5, beta = 0.0
, and let B = DA
. Note that this way assumes your diagonal matrix D
is real.
SYRK is fine for A'A. For A'DA you can use SYMM for one side of it eg V = A'D and then use intel MKL's GEMMT for W = V A. GEMMT is like GEMM except it takes advantage of the fact that the result matrix is symmetric, and thus only has to do roughly half the work.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.