
Function Pointers With CUDA Device Functions

I would like to use function pointers in my CUDA C++ code, like below,

typedef __device__ void customFunc(const char*, uint64_t, char*, const uint64_t);

which is what I'm after. Its equivalent without "__device__" works perfectly well.

Are function pointers supported in CUDA?

Edit:

I'm specifically interested in how to use __device__ functions as function pointers to other __device__ functions.

There is no magic involved in using device function pointers in device code. It is functionally and syntactically identical to standard C++.

For example:

#include <cstdio>

typedef int (*ufunc)(int args);

__device__ int f1(int x)
{
    int res = 2*x;
    printf("f1 arg = %d, res = %d\n", x, res);
    return res;
}

__device__ int f2(int x, int y, ufunc op)
{
    int res = x + op(y);
    printf("f2 arg = %d, %d, res = %d\n", x, y, res);
    return res;
}


__global__ void kernel(int *z) 
{

    int x = threadIdx.x;
    int y = blockIdx.x;
    int tid = threadIdx.x + blockDim.x * blockIdx.x;

    z[tid] = f2(x, y, &f1);
}

int main()
{
    const int nt = 4, nb = 4;
    int* a_d;
    cudaMalloc(&a_d, sizeof(int) * nt * nb);

    kernel<<<nb, nt>>>(a_d);
    cudaDeviceSynchronize();
    cudaDeviceReset();

    return 0;
}

Here, we define a pointer-to-unary-function type, and then a device function which takes that type as an argument. The static assignment of the function pointer within the kernel is resolved at compile time and everything works. If you want function pointer selection to happen at run time, then you need to follow the instructions given in the link you were already provided with.

The important thing to keep in mind here is that in CUDA it is not legal to include CUDA specifiers ( __device__ , __constant__ , __global__ , etc.) in type definitions. Each variable instance carries the specifier as part of its own definition, not as part of the type.
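For the run-time selection case mentioned above, one common pattern is a device-side table of function pointers indexed by a value passed into the kernel. The sketch below is hypothetical (the function names `doubler` and `tripler` and the table `ops` are illustrative, not from the original answer), and assumes a device of compute capability 2.0 or later, where taking the address of a __device__ function in device code is supported:

```cuda
#include <cstdio>

typedef int (*ufunc)(int);

__device__ int doubler(int x) { return 2 * x; }
__device__ int tripler(int x) { return 3 * x; }

// A device-side table of device function pointers. Initializing a
// __device__ variable with device function addresses keeps the
// address-taking in device context, which is what CUDA requires.
__device__ ufunc ops[2] = { doubler, tripler };

__global__ void kernel(int *z, int which)
{
    int tid = threadIdx.x + blockDim.x * blockIdx.x;
    z[tid] = ops[which](tid);   // dispatch chosen at run time
}

int main()
{
    const int n = 4;
    int *a_d;
    cudaMalloc(&a_d, sizeof(int) * n);

    // The host picks the operation; the device resolves the call.
    kernel<<<1, n>>>(a_d, 1);
    cudaDeviceSynchronize();

    cudaFree(a_d);
    return 0;
}
```

Note that the host cannot take the address of a __device__ function directly; the table must be populated in device code (as here) or copied via cudaMemcpyFromSymbol-style machinery.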
