How to implement an interface to a sub-matrix in CUDA?

I have a wrapper class CudaMatrix that implements several cuBLAS operations, allowing me to call m1.multiply(m2), which runs the sgemm operation on the internal data pointers.

I would like to extend the class with operations on sub-matrices, something like:

CudaMatrix a(100,100);
CudaMatrix b(100,100);
// fill a and b

int i=5, j=15;
CudaSubMatrix sa(a, i, j, i+10, j+10); // sa := a[5:15, 15:25]

i=50, j=60;
CudaSubMatrix sb(b, i, j, i+10, j+10); // sb := b[50:60, 60:70]    

CudaMatrix res;
res.copy(sa);
res.multiply(sb);  // res = sa*sb

In the last line, multiply() needs to operate on the sub-matrix sb, whose data is not contiguous in memory, so I can't call the same sgemm operation as before.

How do I implement an efficient interface to sub-matrices that avoids copying data explicitly? Are there any open-source implementations that I can look at?

Sub-matrix multiplication can be performed using the leading-dimension parameters (ldx, that is lda, ldb and ldc) of the cuBLAS API calls.

Indexing is described in section 1.1, Data Layout, of the cuBLAS documentation:

#define IDX2C(i,j,ld) (((j)*(ld))+(i))
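To make the role of ld concrete, here is a minimal host-side sketch of a non-owning sub-matrix view built on the IDX2C macro. The names SubMatrixView, make_view and demo_view_element are hypothetical and are not part of cuBLAS or of the CudaMatrix class from the question; the sketch only shows that a sub-matrix can be described by a pointer to its first element plus the parent's leading dimension.

```cpp
#include <cassert>
#include <vector>

// Column-major, 0-based index macro as given in the cuBLAS documentation.
#define IDX2C(i, j, ld) (((j) * (ld)) + (i))

// Hypothetical non-owning view: no data is copied, only a pointer and sizes.
struct SubMatrixView {
    const float* data;  // points at parent(row0, col0)
    int rows;           // rows of the sub-matrix
    int cols;           // columns of the sub-matrix
    int ld;             // leading dimension = number of rows of the parent

    float at(int i, int j) const {
        assert(i >= 0 && i < rows && j >= 0 && j < cols);
        return data[IDX2C(i, j, ld)];
    }
};

// Build a view of parent[row0 : row0+rows, col0 : col0+cols].
SubMatrixView make_view(const float* parent, int parent_rows,
                        int row0, int col0, int rows, int cols) {
    return SubMatrixView{parent + IDX2C(row0, col0, parent_rows),
                         rows, cols, parent_rows};
}

// Demo: parent is 100 x 100 with parent(r, c) = r + 1000 * c,
// and the view corresponds to sa := a[5:15, 15:25] from the question.
float demo_view_element(int i, int j) {
    const int n = 100;
    std::vector<float> a(n * n);
    for (int c = 0; c < n; ++c)
        for (int r = 0; r < n; ++r)
            a[IDX2C(r, c, n)] = static_cast<float>(r + 1000 * c);
    SubMatrixView sa = make_view(a.data(), n, 5, 15, 10, 10);
    return sa.at(i, j);
}
```

Element (0, 0) of the view is element (5, 15) of the parent, so consecutive columns of the view are ld elements apart in memory rather than rows elements apart.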

Then call cublasSgemm, for example, with the lda parameter equal to the number of rows of the original matrix (the cuBLAS library uses column-major storage), and with m, n, k set to the dimensions of the sub-matrices.

Note that indexing differs between the Fortran (1-based) and C (0-based) conventions.

Hence all you really need is the size of your sub-matrix (columns, rows), a pointer to its first element, and the size of a column in the input matrix (its number of rows).
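The following is a sketch of that idea using a CPU stand-in for sgemm, so that it runs without a GPU. The function sgemm_ref is a hypothetical reference implementation, not a cuBLAS call; its arguments are meant to carry the same meaning as m, n, k, lda, ldb and ldc in a non-transposed cublasSgemm. The sub-matrix bounds are taken from the question, and the fill values are chosen only to make the result easy to check.

```cpp
#include <cassert>
#include <vector>

#define IDX2C(i, j, ld) (((j) * (ld)) + (i))

// Hypothetical CPU reference: C (m x n) = alpha * A (m x k) * B (k x n) + beta * C,
// all column-major, with explicit leading dimensions as in the BLAS interface.
void sgemm_ref(int m, int n, int k, float alpha,
               const float* A, int lda, const float* B, int ldb,
               float beta, float* C, int ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[IDX2C(i, p, lda)] * B[IDX2C(p, j, ldb)];
            C[IDX2C(i, j, ldc)] = alpha * acc + beta * C[IDX2C(i, j, ldc)];
        }
    }
}

// res = a[5:15, 15:25] * b[50:60, 60:70] without copying either sub-matrix.
float demo_submatrix_product(int i, int j) {
    const int n = 100;   // rows of the parent matrices (their leading dimension)
    const int sub = 10;  // sub-matrices are sub x sub
    std::vector<float> a(n * n), b(n * n), res(sub * sub, 0.0f);
    for (int c = 0; c < n; ++c) {
        for (int r = 0; r < n; ++r) {
            a[IDX2C(r, c, n)] = static_cast<float>(r + c);
            b[IDX2C(r, c, n)] = static_cast<float>(c);
        }
    }
    // Pointers to the first element of each sub-matrix inside its parent.
    const float* sa = a.data() + IDX2C(5, 15, n);
    const float* sb = b.data() + IDX2C(50, 60, n);

    // m, n, k describe the sub-matrices; lda and ldb stay at the parent's row count.
    sgemm_ref(sub, sub, sub, 1.0f, sa, n, sb, n, 0.0f, res.data(), sub);

    // With device pointers d_sa, d_sb, d_res (hypothetical names) computed the
    // same way, the corresponding cuBLAS call would be:
    //   cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, sub, sub, sub,
    //               &alpha, d_sa, n, d_sb, n, &beta, d_res, sub);
    return res[IDX2C(i, j, sub)];
}
```

A CudaSubMatrix class along these lines would presumably only need to store the offset pointer, the sub-matrix dimensions and the parent's leading dimension, and forward them to the existing multiply() implementation.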
