Configuration parameters of a CUDA kernel
I have to add two N x N square matrices using a CUDA program. The book asks for the kernel's configuration parameters in these cases:
(a) Each thread processes exactly one matrix element
(b) Each thread produces one output matrix row
(c) Each thread produces one output matrix column
My solutions for the above:
(a)
dim3 threadPerBlocks(1,1,1);
dim3 numBlocks(N,N,1);
(b)
dim3 threadPerBlocks(N,1,1);
dim3 numBlocks(1,N,1);
(c)
dim3 threadPerBlocks(1,N,1);
dim3 numBlocks(N,1,1);
I have no idea whether I am right or wrong for parts (b) and (c). Please tell me, with a brief explanation; if they are wrong, please correct me and explain why.
(a) is fine, but you can write it in different ways. All that is required is N x N threads, so that each thread processes one element.
An alternative for (a), valid as long as N does not exceed the maximum number of threads per block (1024 on current GPUs), is:
dim3 threadPerBlocks(N,1,1);
dim3 numBlocks(N,1,1);
And in the kernel you compute the element index as:
int id = blockIdx.x * blockDim.x + threadIdx.x;
array[id] = ...; // process one element
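To make the one-element-per-thread case concrete, here is a minimal sketch built around the alternative configuration above. The kernel name `addPerElement`, the host helper `checkPerElement`, the row-major flat storage, and the use of `cudaMallocManaged` are all assumptions made for illustration; they are not from the book or the original answer.

```cuda
#include <cuda_runtime.h>

// One thread per matrix element; matrices are flat row-major arrays (assumed).
__global__ void addPerElement(const float* a, const float* b, float* c, int n)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;  // flat element index
    if (id < n * n)                                  // guard against extra threads
        c[id] = a[id] + b[id];
}

// Hypothetical host helper: launches the kernel and verifies the sum on the CPU.
bool checkPerElement(int n)
{
    int count = n * n;
    float *a, *b, *c;
    cudaMallocManaged(&a, count * sizeof(float));
    cudaMallocManaged(&b, count * sizeof(float));
    cudaMallocManaged(&c, count * sizeof(float));
    for (int i = 0; i < count; ++i) { a[i] = (float)i; b[i] = 2.0f * i; c[i] = 0.0f; }

    dim3 threadPerBlocks(n, 1, 1);  // N threads per block (needs N <= 1024)
    dim3 numBlocks(n, 1, 1);        // N blocks, so N x N threads in total
    addPerElement<<<numBlocks, threadPerBlocks>>>(a, b, c, n);
    bool ok = cudaGetLastError() == cudaSuccess
           && cudaDeviceSynchronize() == cudaSuccess;

    for (int i = 0; ok && i < count; ++i)
        ok = (c[i] == a[i] + b[i]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return ok;
}
```

Calling `checkPerElement(8)` from `main` launches 8 blocks of 8 threads, one thread for each of the 64 elements. The `id < n * n` guard is not strictly needed with this exact configuration, but it keeps the kernel safe if the launch is later rounded up to a fixed block size.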
But (b) says that each thread must produce one output matrix row, so you need only N threads, one per row. What you have written still gives you N x N threads.
So you can write it this way. This is one possible way; there are others too.
dim3 threadPerBlocks(N,1,1);
dim3 numBlocks(1,1,1);
int idx = threadIdx.x; // in the kernel: the row handled by this thread
Then use a for loop to process one row in each thread:
for (int i = 0; i < N; i++)
{
    int index = idx * N + i; // element i of row idx (row-major)
    array[index] = ...;
}
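The row-per-thread idea can be sketched as a complete kernel plus launch. As an assumption for illustration, the matrices are flat row-major arrays in managed memory, and the names `addPerRow` and `checkPerRow` are hypothetical. A single block of N threads only works while N stays within the per-block thread limit.

```cuda
#include <cuda_runtime.h>

// One thread per output row; each thread loops over the N columns of its row.
__global__ void addPerRow(const float* a, const float* b, float* c, int n)
{
    int idx = threadIdx.x;               // row handled by this thread
    if (idx < n) {
        for (int i = 0; i < n; ++i) {
            int index = idx * n + i;     // element i of row idx (row-major)
            c[index] = a[index] + b[index];
        }
    }
}

// Hypothetical host helper: launches the kernel and verifies the sum on the CPU.
bool checkPerRow(int n)
{
    int count = n * n;
    float *a, *b, *c;
    cudaMallocManaged(&a, count * sizeof(float));
    cudaMallocManaged(&b, count * sizeof(float));
    cudaMallocManaged(&c, count * sizeof(float));
    for (int i = 0; i < count; ++i) { a[i] = (float)i; b[i] = 2.0f * i; c[i] = 0.0f; }

    dim3 threadPerBlocks(n, 1, 1);  // N threads, one per row (needs N <= 1024)
    dim3 numBlocks(1, 1, 1);        // a single block, so N threads in total
    addPerRow<<<numBlocks, threadPerBlocks>>>(a, b, c, n);
    bool ok = cudaGetLastError() == cudaSuccess
           && cudaDeviceSynchronize() == cudaSuccess;

    for (int i = 0; ok && i < count; ++i)
        ok = (c[i] == a[i] + b[i]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return ok;
}
```

For larger N, one option is to split the N threads across several blocks and compute the row as `blockIdx.x * blockDim.x + threadIdx.x`, keeping the `idx < n` guard for the last, partially filled block.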
You can work out case (c) in the same way.
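One way case (c) might look, following the same pattern: again N threads in a single block, but each thread now walks down a column instead of along a row. The names `addPerColumn` and `checkPerColumn`, the row-major layout, and the managed-memory allocation are assumptions for this sketch only.

```cuda
#include <cuda_runtime.h>

// One thread per output column; each thread loops over the N rows of its column.
__global__ void addPerColumn(const float* a, const float* b, float* c, int n)
{
    int col = threadIdx.x;               // column handled by this thread
    if (col < n) {
        for (int i = 0; i < n; ++i) {
            int index = i * n + col;     // element in row i, column col (row-major)
            c[index] = a[index] + b[index];
        }
    }
}

// Hypothetical host helper: launches the kernel and verifies the sum on the CPU.
bool checkPerColumn(int n)
{
    int count = n * n;
    float *a, *b, *c;
    cudaMallocManaged(&a, count * sizeof(float));
    cudaMallocManaged(&b, count * sizeof(float));
    cudaMallocManaged(&c, count * sizeof(float));
    for (int i = 0; i < count; ++i) { a[i] = (float)i; b[i] = 2.0f * i; c[i] = 0.0f; }

    dim3 threadPerBlocks(n, 1, 1);  // N threads, one per column (needs N <= 1024)
    dim3 numBlocks(1, 1, 1);        // a single block, so N threads in total
    addPerColumn<<<numBlocks, threadPerBlocks>>>(a, b, c, n);
    bool ok = cudaGetLastError() == cudaSuccess
           && cudaDeviceSynchronize() == cudaSuccess;

    for (int i = 0; ok && i < count; ++i)
        ok = (c[i] == a[i] + b[i]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return ok;
}
```

With row-major storage, neighbouring threads in this version touch neighbouring addresses on every loop iteration, whereas in the row-per-thread version they are N elements apart, so the column variant tends to make better use of memory coalescing.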