CUDA cudaMalloc

Question

I've started writing a new CUDA application and hit an odd detour along the way. The first call to cudaMalloc on a variable x fails, but when I call it a second time it returns cudaSuccess. I recently upgraded to the CUDA 4.0 SDK, and this looks like a really weird bug.

I did some more testing, and it does seem to be the first call to cudaMalloc that fails.

Answer 1

The very first call to any of the CUDA library functions launches an initialisation subroutine. It may be that the initialisation is what fails, and not the cudaMalloc itself. (CUDA Programming Guide, section 3.2.1)

For some reason the later call then works, despite the initial failure. I don't know your setup or your code, so I can't really help you further. Check the Programming Guide!
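One way to tell an initialisation failure apart from a genuine allocation failure is to trigger the initialisation yourself before the first cudaMalloc and check its status separately. The following is only a minimal sketch, not the asker's code: it uses the common cudaFree(0) idiom to force initialisation, and the buffer name x and its size are assumptions made for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // cudaFree(0) frees nothing, but it still forces the runtime to
    // initialise, so an initialisation error is reported here rather
    // than by the first real allocation.
    cudaError_t initErr = cudaFree(0);
    if (initErr != cudaSuccess) {
        fprintf(stderr, "CUDA initialisation failed: %s\n",
                cudaGetErrorString(initErr));
        return 1;
    }

    float *x = NULL;                            // hypothetical device buffer
    const size_t bytes = 1024 * sizeof(float);  // assumed size, for illustration

    // With initialisation already done, a failure here belongs to cudaMalloc.
    cudaError_t allocErr = cudaMalloc((void **)&x, bytes);
    if (allocErr != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n",
                cudaGetErrorString(allocErr));
        return 1;
    }

    cudaFree(x);
    return 0;
}
```

If the first message is the one that prints, the problem is more likely in the driver or toolkit installation than in the allocation itself.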

Answer 2

I originally recommended using the CUDA_SAFE_CALL macro (from the cutil library), if you aren't already, to force thread synchronisation, at least while you're debugging the code:

CUDA_SAFE_CALL(cudaMalloc((void**) &(myVar), mem_size_N ));

Update: As per @talonmies, you don't need the cutil library. So let's rewrite the solution:

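/*  Allocate Data, keeping the status that cudaMalloc returns  */
cudaError_t err = cudaMalloc((void**) &(myVar), mem_size_N );

/*  Force Thread Synchronization (cudaDeviceSynchronize() replaces the
    deprecated cudaThreadSynchronize() as of CUDA 4.0)  */
if ( cudaSuccess == err )
{
    err = cudaDeviceSynchronize();
}

/*  Check for and display Error  */
if ( cudaSuccess != err )
{
    fprintf( stderr, "Cuda error in file '%s' in line %i : %s.\n",
             __FILE__, __LINE__, cudaGetErrorString( err) );
}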

And as noted in the other answer, you may want to include the sync and check before you allocate memory, just to make sure the API initialized correctly.
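The sync-and-check before the allocation could look roughly like this. It is a sketch under assumptions: CHECK_CUDA is a hypothetical helper defined here (it is not part of CUDA or cutil), the size given to mem_size_N is made up, and cudaDeviceSynchronize() requires CUDA 4.0 or later.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper, not part of CUDA or cutil: evaluates a runtime
// call once, prints the error with file and line, and exits on failure.
#define CHECK_CUDA(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "Cuda error in file '%s' in line %i : %s.\n", \
                    __FILE__, __LINE__, cudaGetErrorString(err_));        \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

int main()
{
    float *myVar = NULL;
    const size_t mem_size_N = 256 * sizeof(float);  // assumed size

    // Sync and check first: if the API failed to initialise, the error
    // is reported on this line instead of on the allocation.
    CHECK_CUDA(cudaDeviceSynchronize());

    // Allocate, checking the status that cudaMalloc itself returns.
    CHECK_CUDA(cudaMalloc((void **)&myVar, mem_size_N));

    CHECK_CUDA(cudaFree(myVar));
    return 0;
}
```

Wrapping every call this way keeps the file and line of the failing call in the message, which is most of what CUDA_SAFE_CALL provided, without pulling in cutil.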

2 answers

solution1
4 ACCEPTED 2011-06-29 08:06:11

solution2
2 2011-06-29 06:15:19
