
OpenCL: convert an MxN matrix to a square matrix

I'm trying to convert a 3x2 matrix into a 4x4 square matrix:

__kernel void padding(__global float* newM, int m, int n, int newlength)
{

}

The matrix "newM" is in row-major order, with m=3, n=2 and newlength=4. The elements of newM are all packed at the front of the buffer, and the tail of the matrix is just zeros. My confusion is how I can shift the elements along without losing the subsequent values. I would create a local copy, but the matrices I am dealing with are extremely large and do not fit into private memory.

Here's a one-dimensional view:

[1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0] -> [1,1,1,0,1,1,1,0,0,0,0,0,0,0,0,0]

Here's a two-dimensional view:

[1, 1, 1]    [1, 1, 1, 0]
[1, 1, 1] -> [1, 1, 1, 0]
             [0, 0, 0, 0]
             [0, 0, 0, 0]

How it actually looks in memory when viewed as 4x4:

[1, 1, 1, 1]    [1, 1, 1, 0]
[1, 1, 0, 0] -> [1, 1, 1, 0]
[0, 0, 0, 0]    [0, 0, 0, 0]
[0, 0, 0, 0]    [0, 0, 0, 0]

All the numbers used here are just for this example; in reality the matrices hold random floats and the dimensions exceed 2000x2000.

Any ideas? Thanks.

If your data is stored row-wise, do this:

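__kernel void padding(__global float* newMa, __global const float* oldMa, int oldR, int oldC, int N)
{
    int id = get_global_id(0);        // one work-item per element of the N x N output
    int r = id / N;                   // row in the new matrix
    int c = id % N;                   // column in the new matrix
    float value = 0.0f;               // default: zero padding
    if (r < oldR && c < oldC)         // inside the old matrix: both bounds must hold
        value = oldMa[r * oldC + c];  // the old matrix has a row stride of oldC
    newMa[id] = value;
}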

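The new matrix buffer must have enough space for the result, that is, N x N floats. Since the kernel writes one element per work-item and has no bounds check, it should be launched with exactly N*N work-items.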
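To check the index mapping without a device, here is a minimal CPU sketch of the same padding logic in plain C. The function name `pad_reference` is made up for this example; the loop variable `id` stands in for `get_global_id(0)`, an element is copied only when `r < oldR && c < oldC`, and the old matrix is read with a row stride of `oldC`.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the padding step (hypothetical helper, not part of OpenCL).
   Copies an oldR x oldC row-major matrix into the top-left corner of an
   N x N row-major matrix and fills every other element with zero. */
void pad_reference(float *newMa, const float *oldMa, int oldR, int oldC, int N)
{
    assert(oldR <= N && oldC <= N);        /* the old matrix must fit */
    for (int id = 0; id < N * N; ++id) {   /* id plays the role of get_global_id(0) */
        int r = id / N;                    /* row in the new matrix */
        int c = id % N;                    /* column in the new matrix */
        float value = 0.0f;                /* default: zero padding */
        if (r < oldR && c < oldC)          /* inside the old matrix */
            value = oldMa[(size_t)r * oldC + c];
        newMa[id] = value;
    }
}
```

Comparing the output of this function with the buffer read back from the device is a cheap way to catch stride or bounds mistakes on a small case such as 2 rows by 3 columns padded to 4x4.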

I don't know whether you are using this memory ordering. Could you describe how you expect the data to interface with your other kernels? As the other answer says, you probably don't need a separate kernel for such a simple operation. You can also integrate this into your other kernel.
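One way to read "integrate this into your other kernel" is to never build the padded matrix at all and instead let the consumer read through a small accessor that returns zero outside the old bounds. The sketch below is plain C and the name `padded_at` is hypothetical; in a kernel the pointer would carry a `__global` qualifier, but the indexing would be the same.

```c
#include <assert.h>
#include <stddef.h>

/* Reads element (r, c) of the virtual N x N padded matrix directly from the
   compact oldR x oldC row-major buffer. Positions outside the old matrix
   read as zero, so no padded copy is ever stored. Hypothetical helper. */
float padded_at(const float *oldMa, int oldR, int oldC, int r, int c)
{
    assert(r >= 0 && c >= 0);
    if (r < oldR && c < oldC)
        return oldMa[(size_t)r * oldC + c];   /* row stride of the old matrix */
    return 0.0f;                              /* padding region */
}
```

Whether this is worth it depends on the consumer: it saves the N x N buffer and a copy pass, at the cost of a branch on every read.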

If you don't need to do any math, and the only goal is to interpret the data in a different way, you don't need any OpenCL here.

Reallocate the memory and introduce the new matrix rows.
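If the padding is done on the host, the question's worry about overwriting values that have not been moved yet can be avoided by working from the last row back to the first. The following is a minimal sketch in C, assuming the buffer already has room for N*N floats with the compact data at the front; `pad_in_place` is a made-up name. Note that this ordering argument only holds for a sequential loop: a parallel kernel shifting elements within a single buffer would have no such guarantee between work-items.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expands a compact oldR x oldC row-major matrix, stored at the front of a
   buffer of N * N floats, into N x N row-major layout in place.
   Rows are moved last to first: the destination of row r starts at r * N,
   which is never before its source at r * oldC, so rows that have not been
   moved yet are never overwritten. Hypothetical helper. */
void pad_in_place(float *buf, int oldR, int oldC, int N)
{
    assert(oldR <= N && oldC <= N);
    /* Rows below the old matrix lie entirely past the compact data. */
    for (size_t i = (size_t)oldR * N; i < (size_t)N * N; ++i)
        buf[i] = 0.0f;
    for (int r = oldR - 1; r >= 0; --r) {
        float *src = buf + (size_t)r * oldC;   /* row r in the compact layout */
        float *dst = buf + (size_t)r * N;      /* row r in the padded layout */
        memmove(dst, src, (size_t)oldC * sizeof(float));  /* ranges may overlap */
        for (int c = oldC; c < N; ++c)         /* zero the padding columns */
            dst[c] = 0.0f;
    }
}
```

For the example in the question, calling `pad_in_place(buf, 2, 3, 4)` on a 16-float buffer whose first six entries hold the data yields two rows of three values, each followed by a zero, and two all-zero rows.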


 