What makes cuLaunchKernel fail with CUDA_ERROR_INVALID_HANDLE?

Question

I'm launching a CUDA kernel I've compiled, using the cuLaunchKernel() driver API function. I'm passing my parameters in a kernelParams array, and passing nullptr for the extra argument.

Unfortunately, this fails with the error CUDA_ERROR_INVALID_HANDLE. Why? I checked the Driver API documentation to see in which cases the function might fail, and it discusses the failure with CUDA_ERROR_INVALID_VALUE (not the same thing). It doesn't discuss the error I get.

Since more than one parameter to cuLaunchKernel() is some sort of handle, what does this failure mean? (And if there are multiple possibilities, what are they?)

Answer 1

One possibility is a failure due to a CUDA driver context switch. You may have inadvertently performed some action which pushes or replaces the current context for the CUDA device. Loaded modules belong to a context, so the kernel you compiled and loaded is not valid in the new current context, and launching it there triggers a CUDA_ERROR_INVALID_HANDLE failure.

Assuming this is the case, switch to the right context before the launch, e.g. this way:

cuCtxPushCurrent(my_driver_context);
cuLaunchKernel(/*etc. etc. */);
/* possibly */ cuCtxPopCurrent(NULL);

or like so:

cuCtxSetCurrent(my_driver_context);
cuLaunchKernel(/*etc. etc. */);

Note that you risk leaking the context if you pop it and discard the only reference to it; you also risk breaking other code which assumes that the context it put in place is still the current one.

Answer 2

Well, in my case it was an out-of-memory (OOM) error which for some reason was not reported as such. When I reduced the batch size of my model, it worked. Maybe you should check whether this is the case for you too.

2 answers

solution1
2 ACCEPTED 2020-07-07 13:10:44

solution2
0 2021-12-20 08:59:50
