I have a 2D matrix containing only the values 0, 1, and 2. I am writing a CUDA kernel in which the number of threads equals the number of matrix elements, so each thread operates on one element. I need a mathematical operation, without any if-else, that leaves 0 and 1 unchanged but converts 2 to 1, i.e. 0 -> 0; 1 -> 1; 2 -> 1. Is there a way to do this conversion using only mathematical operators? Any help would be greatly appreciated. Thank you.
This is not a CUDA question.
int A;
// set A to 0, 1, or 2
int a = (A + (A>>1)) & 1;
// a is now 0 if A is 0, or 1 if A is 1 or 2
Or as a macro (with the argument parenthesized so that expressions such as fix01(a|b) expand correctly):
#define fix01(x) (((x)+((x)>>1))&1)
int a = fix01(A);
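To make it easier to see why the shift-and-add trick works, here is a minimal sketch of the same arithmetic as a function instead of a macro. The name fix01_fn is hypothetical (it is not from the answer above), and the sketch assumes the input is restricted to 0, 1, or 2 as stated in the question.

```c
/* Hypothetical helper name; the arithmetic is the (A + (A >> 1)) & 1
   expression from the answer. Assumes x is 0, 1, or 2. */
static inline int fix01_fn(int x)
{
    /* x = 0: (0 + 0) & 1 = 0
       x = 1: (1 + 0) & 1 = 1
       x = 2: (2 + 1) & 1 = 1 */
    return (x + (x >> 1)) & 1;
}
```

Unlike the macro, a function evaluates its argument only once, so a call such as fix01_fn(i++) behaves as expected.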
This also seems to work:
#define fix01(x) (((x)&&1)&1)
I don't know if the use of the boolean AND operator (&&) fits your definition of "mathematical operations".
As the question was about "mathematical" functions, I suggest the following second-order polynomial:
int f(int x) { return ((3-x)*x)/2; }
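For reference, the polynomial can be derived by fitting a quadratic through the three required points (0,0), (1,1), and (2,1). The coefficient names a and b below are introduced only for this derivation:

```latex
\begin{aligned}
f(x) &= a x^2 + b x, \qquad f(0) = 0 \\
f(1) &= a + b = 1 \\
f(2) &= 4a + 2b = 1 \\
\Rightarrow\ a &= -\tfrac{1}{2}, \quad b = \tfrac{3}{2} \\
f(x) &= \frac{3x - x^2}{2} = \frac{(3 - x)\,x}{2}
\end{aligned}
```

Since x and 3-x always have opposite parity, their product is even, so the integer division by 2 in the C version is exact.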
But if you want to avoid branching in order to maximize speed: there has been a min instruction since PTX ISA 1.0. (See Tab. 36 in the PTX ISA 3.1 manual.) So the following CUDA code
__global__ void test(int *x, int *y)
{
    *y = *x <= 1 ? *x : 1;
}
compiles to the following assembly in my test (I simply called nvcc from CUDA 5 without any arch options; note that the listing shows the machine-level code generated for sm_10 rather than PTX proper):
code for sm_10
Function : _Z4testPiS_
/*0000*/ /*0x1000c8010423c780*/ MOV R0, g [0x4];
/*0008*/ /*0xd00e000580c00780*/ GLD.U32 R1, global14 [R0];
/*0010*/ /*0x1000cc010423c780*/ MOV R0, g [0x6];
/*0018*/ /*0x30800205ac400780*/ IMIN.S32 R1, R1, c [0x1] [0x0];
/*0020*/ /*0xd00e0005a0c00781*/ GST.U32 global14 [R0], R1;
So a min() implementation using the conditional operator ?: actually compiles to a single IMIN.S32 machine instruction, without any branching. I'd therefore recommend this for any real-world application:
int f(int x) { return x <= 1 ? x : 1; }
But back to the question of using only non-branching operations:
Another way to get this result in C is to apply the logical NOT operator twice:
int f(int x) { return !!x; }
Or simply compare with zero:
int f(int x) { return x != 0; }
(The results of ! and != are guaranteed to be 0 or 1; compare Sec. 6.5.3.3 Par. 5 and Sec. 6.5.9 Par. 3 of the C99 standard, ISO/IEC 9899:1999. As far as I recall, this guarantee also holds in CUDA.)
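One point worth checking before picking a variant: the suggestions in this thread agree on 0, 1, and 2, but not outside that range. The sketch below puts four of them side by side. All function names are hypothetical labels chosen for this comparison, not names from the answers. For x = 3, for example, the bit trick and the polynomial give 0, while the min form and the != 0 form give 1.

```c
/* All names below are hypothetical labels for the variants in this thread. */
static int f_bits(int x)    { return (x + (x >> 1)) & 1; }   /* shift-and-add  */
static int f_poly(int x)    { return ((3 - x) * x) / 2; }    /* polynomial     */
static int f_min(int x)     { return x <= 1 ? x : 1; }       /* min(x, 1)      */
static int f_nonzero(int x) { return x != 0; }               /* compare with 0 */

/* Returns 1 if all four variants produce the same result for x, else 0. */
static int variants_agree(int x)
{
    int r = f_bits(x);
    return r == f_poly(x) && r == f_min(x) && r == f_nonzero(x);
}
```

If the matrix is guaranteed to hold only 0, 1, and 2, any of the variants will do; if other values can slip in, the choice decides how they are mapped.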