
Running zero blocks in CUDA

I have a loop like this:

while ( ... ) {
    ...
    kernel<<<blocks, threads>>>( ... );
}

and in some iterations blocks or threads have the value 0. When I use this, my code runs. My question is whether this is considered bad practice, and whether there are any other bad side effects.

It's bad practice because it will interfere with proper CUDA error checking.

If you do proper error checking, your kernel launches that have all-zero values for block or grid dimensions will throw an error.
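
For example, with the usual pattern of querying cudaGetLastError() immediately after the launch (a minimal sketch; the empty kernel and the zero grid dimension are only there to reproduce the situation from the question), the failure becomes visible. On current toolkits such a launch typically reports "invalid configuration argument":

#include <stdio.h>

__global__ void kernel()
{
}

int main()
{
    // Grid dimension of zero, as in the question's problematic iterations.
    kernel<<<0, 128>>>();

    // Proper error checking: query the launch status right away.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("launch failed: %s\n", cudaGetErrorString(err));  // e.g. "invalid configuration argument"

    return 0;
}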

It's preferable to write error free programs for a variety of reasons.

Instead, include a test for these cases and skip the kernel launch when your dimensions are zero. The small overhead in C-code to do this will be more than offset by the reduced API overhead by not making the spurious kernel launch request.
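
A minimal sketch of that guard (the kernel body, the work size n, and the 256-thread block size are placeholders, not taken from the question):

#include <stdio.h>

__global__ void kernel(int n)
{
    // Placeholder body; the real kernel would do its per-element work here.
}

int main()
{
    int n = 0;                             // hypothetical work size; may be zero in some iterations
    int threads = 256;                     // assumed block size
    int blocks  = (n + threads - 1) / threads;

    // Skip the launch entirely when there is no work to do.
    if (blocks > 0) {
        kernel<<<blocks, threads>>>(n);

        // Error checking now only reports genuine failures.
        cudaError_t err = cudaGetLastError();
        if (err != cudaSuccess)
            fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
    }
    return 0;
}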

I tried a zero-block kernel invocation by simply writing the following empty kernel.

File:

#include <stdio.h>

// Empty kernel used only to measure the launch overhead.
__global__ void fg()
{
}

int main()
{
    // Launch with a grid of zero blocks, as in the question.
    fg<<<0, 1>>>();
    return 0;
}

The only side effect I noticed was in the time required for execution.

Run time:

real 0m0.242s, user 0m0.004s, sys 0m0.148s.

When I run the same file with the kernel invocation commented out, the time overhead decreases.

Run time:

real 0m0.003s, user 0m0.000s, sys 0m0.000s.

This side effect arises from the overhead of making the kernel launch request for zero blocks, which in this small test also includes the one-time CUDA context initialization triggered by the first CUDA call.
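
To separate the per-call launch overhead from that one-time initialization, one rough approach (a sketch, reusing the empty fg kernel from above and using cudaFree(0) only as a warm-up to force initialization before timing) is:

#include <stdio.h>
#include <chrono>

__global__ void fg()
{
}

int main()
{
    // Warm-up: force CUDA context creation so it is not counted below.
    cudaFree(0);

    auto t0 = std::chrono::high_resolution_clock::now();
    fg<<<0, 1>>>();                          // the zero-block launch request
    cudaError_t err = cudaGetLastError();    // consume the resulting error
    auto t1 = std::chrono::high_resolution_clock::now();

    double us = std::chrono::duration<double, std::micro>(t1 - t0).count();
    printf("launch attempt took %.2f us (%s)\n", us, cudaGetErrorString(err));
    return 0;
}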
