
Is there a most efficient way to multiply three matrices A * B * C = D using cuBLAS?

I want to find the most efficient way to multiply three matrices using cuBLAS. My current solution uses the obvious approach of two calls to cublas<t>gemm:

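// A is m x k, B is k x n, C is n x p (column-major, after applying transa/transb/transc)
cublas<t>gemm(cublasH, transa, transb, m, n, k, &alpha, d_A, lda, d_B, ldb, &beta, d_AB, ldab)
// d_AB is already in its final layout, so it is passed untransposed
cublas<t>gemm(cublasH, CUBLAS_OP_N, transc, m, p, n, &alpha, d_AB, ldab, d_C, ldc, &beta, d_D, ldd)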

I don't think this is a bad solution. It would just be better if there were some way to do it with a single kernel/function call rather than two, since a single kernel would presumably be somewhat faster.

I've looked at cublas<t>gemmBatched hoping it could be adapted to this, but the documentation states that the multiplications in a batch must be independent of each other, so that seems to be off the table.

Is there some way to use cuBLAS or some other mathematical shortcut worth trying to achieve this optimization?

cuBLAS doesn't have any direct support for this, i.e. there is no single function call that accepts three matrices to be multiplied together.

The way to do it in CUBLAS is the way you have already indicated.
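Since two GEMM calls are needed either way, one thing that may be worth checking is the order in which they are performed: matrix multiplication is associative, so (A * B) * C and A * (B * C) give the same D but can differ a lot in arithmetic cost. The following is a minimal host-side sketch of that comparison, not something provided by cuBLAS. The names `GemmShape`, `ChainPlan` and `plan_chain` are hypothetical, and it assumes A is m x k, B is k x n and C is n x p after any transposes have been applied. The cost model counts only floating-point operations and ignores memory traffic and kernel launch overhead, so it is a rough guide rather than a performance guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration; not part of cuBLAS.
// Describes one GEMM: result (m x n) = left (m x k) * right (k x n).
// The three fields correspond to the m, n, k arguments of cublas<t>gemm.
struct GemmShape {
    std::int64_t m, n, k;
};

struct ChainPlan {
    bool left_first;     // true: (A*B)*C, false: A*(B*C)
    GemmShape first;     // shape of the first GEMM call
    GemmShape second;    // shape of the second GEMM call
    std::int64_t flops;  // estimated total floating-point operations
};

// Assumes A is m x k, B is k x n, C is n x p, so D is m x p.
ChainPlan plan_chain(std::int64_t m, std::int64_t k,
                     std::int64_t n, std::int64_t p) {
    assert(m > 0 && k > 0 && n > 0 && p > 0);

    // A GEMM producing an (r x c) result with inner dimension i
    // costs about 2 * r * c * i operations.
    const std::int64_t left  = 2 * (m * k * n + m * n * p);  // AB, then (AB)C
    const std::int64_t right = 2 * (k * n * p + m * k * p);  // BC, then A(BC)

    if (left <= right) {
        // First call: AB is m x n with inner k. Second: D is m x p with inner n.
        return {true, {m, n, k}, {m, p, n}, left};
    }
    // First call: BC is k x p with inner n. Second: D is m x p with inner k.
    return {false, {k, p, n}, {m, p, k}, right};
}
```

For example, with m = 1000, k = 10, n = 1000, p = 10, computing B * C first needs roughly 4e5 operations in total, against roughly 4e7 for A * B first. When all four dimensions are equal, both orders cost the same and the choice does not matter.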
