This might be a throwback to the old BLAS library design, but I was surprised just now to find that cuBLAS wastes memory by using regular 2D arrays for triangular matrices. I suppose this makes interfacing with the rest of the API more convenient.
"I was surprised just now to find that cuBLAS wastes memory by using regular 2D arrays for triangular matrices"
That isn't strictly true.
If you look at the Level 2 BLAS routines, you will see that they include variants (for example tpmv, tpsv, spmv and hpmv) that operate on triangular, symmetric or Hermitian matrices stored in a packed format.
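As a rough illustration of what the packed format means, here is a minimal sketch in plain Python (not cuBLAS code) of the column-major upper-triangular packed layout used by BLAS. The function names `packed_len`, `upper_packed_index` and `pack_upper` are made up for this example, and indices are 0-based.

```python
def packed_len(n):
    """Number of stored elements for an n x n triangular matrix."""
    return n * (n + 1) // 2


def upper_packed_index(i, j):
    """0-based position of A[i][j] (i <= j) in column-major upper packed storage."""
    if i > j:
        raise ValueError("element lies below the diagonal and is not stored")
    return i + j * (j + 1) // 2


def pack_upper(a):
    """Copy the upper triangle of a square list-of-lists matrix into a packed list."""
    n = len(a)
    ap = [0.0] * packed_len(n)
    for j in range(n):          # walk column by column, as BLAS does
        for i in range(j + 1):  # only rows on or above the diagonal
            ap[upper_packed_index(i, j)] = a[i][j]
    return ap


if __name__ == "__main__":
    a = [[1.0, 2.0, 4.0],
         [0.0, 3.0, 5.0],
         [0.0, 0.0, 6.0]]
    print(pack_upper(a))
```

For n = 3 the packed array holds 6 values instead of 9, and the saving approaches one half as n grows.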
The Level 3 BLAS routines don't, but there are two good reasons why they take triangular matrices in full dense format:
1. BLAS does it that way.
2. Those routines were mostly added to BLAS as support for LAPACK solvers, and those solvers typically store the results of factorizations in situ in the supplied full dense inputs, so it is logical to use that format in BLAS.
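To make the in-situ point concrete, here is a minimal sketch in plain Python (not LAPACK or cuBLAS code) of an LU factorization that overwrites its dense input. It skips pivoting for brevity, whereas a real solver such as LAPACK's getrf pivots, and the name `lu_in_place` is invented for this example. After the call, the upper triangle holds U and the strict lower triangle holds the multipliers of L, whose unit diagonal is implied rather than stored.

```python
def lu_in_place(a):
    """Overwrite square matrix a (list of lists) with its unpivoted LU factors.

    Assumes every pivot a[k][k] is nonzero; no row exchanges are performed.
    """
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        for i in range(k + 1, n):
            a[i][k] /= pivot                 # multiplier, stored where L lives
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]  # update the trailing submatrix
    return a


if __name__ == "__main__":
    m = [[4.0, 3.0],
         [6.0, 3.0]]
    lu_in_place(m)
    print(m)  # L (below the diagonal) and U (on and above it) share one array
```

Both triangular factors end up in the same n x n buffer, so a triangular solve that follows can read either one directly from the dense array without any repacking.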
I guess if you don't like the design choice you can always try writing to Jack Dongarra to complain.