How to enable separate compilation for CUDA project in Visual Studio

I am new to CUDA. I am trying to write an application that launches one kernel from inside another kernel, but the build fails with the error "kernel launch from device or global functions requires separate compilation mode". Here is my complete code. Any help would be appreciated.

#include<iostream>
#include<curand.h>
#include<cuda.h>
#include <curand_kernel.h>
#include <stdlib.h>
#include <stdio.h>
using namespace std;

__device__ int *vectorData;
__device__ void initializeArray(int elementCount)
{
    for (int i = 0; i < elementCount; i++)
    {
        vectorData[i] = 1;
    }
}
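__global__ void AddOneToEachElement(int elementCount)
{
    // One thread per element; the bounds check covers the partially filled last block.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < elementCount)
    {
        vectorData[i] = vectorData[i] + 1;
    }
}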
__global__ void addKernel(int *numberOfElements)
{
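    int n = *numberOfElements;
    // Allocate room for all n elements, not just a single int.
    vectorData = (int*)malloc(sizeof(int) * n);
    if (vectorData == NULL) return; // device heap allocation failed
    initializeArray(n);
    // ceil() had no effect on an integer quotient; n / 1024 + 1 always yields enough blocks.
    int gridSize = n / 1024 + 1;
    AddOneToEachElement<<<gridSize, 1024>>>(n);
    // Wait for the child grid before freeing (device-side synchronization is deprecated in recent CUDA toolkits).
    cudaDeviceSynchronize();
    free(vectorData);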
}

int main()
{
    int numberOfElements = 1;
    int *device_numberOfElements;
    cudaMalloc((int**)&device_numberOfElements, sizeof(int));
    cout << "Enter the Number of elements" << endl;
    cin >> numberOfElements;
    cudaMemcpy(device_numberOfElements, &(numberOfElements), sizeof(int), cudaMemcpyHostToDevice);
    addKernel<<<1, 1>>>(device_numberOfElements);
    cudaDeviceSynchronize(); // wait for the kernel and its child grid to finish
    cudaFree(device_numberOfElements);
    return 0;
}

The issue was resolved using the information in the following link: Using CUDA dynamic parallelism in Visual Studio

Here is the complete information that I obtained from that link:

Starting with CUDA 5.0, dynamic parallelism is available on GPUs with compute capability 3.5 or higher. Dynamic parallelism allows kernels to be launched directly from other kernels, which can speed up applications that benefit from managing their workload at runtime on the GPU itself. In many cases it avoids CPU/GPU round trips, which helps with patterns such as recursion. To use dynamic parallelism in Visual Studio 2010 or Visual Studio 2013, do the following:

  • View -> Property Pages
  • Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
  • Configuration Properties -> CUDA C/C++ -> Device -> Code Generation -> compute_35,sm_35
  • Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib
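
As a minimal sketch of what these settings enable, the program below launches a child kernel from a parent kernel. The file name demo.cu, the kernel names parent and addOne, and the two host helpers are hypothetical and not part of the original post. The nvcc command in the leading comment is an assumed command-line equivalent of the Visual Studio options; sm_35 should be swapped for whatever architecture your GPU and toolkit support. The sketch assumes n is greater than zero.

```cuda
// demo.cu (hypothetical file name)
// Assumed command-line equivalent of the Visual Studio settings above:
//   nvcc -rdc=true -arch=sm_35 demo.cu -o demo -lcudadevrt
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Child kernel: one thread per element, guarded against running past n.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] += 1;
}

// Parent kernel: launches the child grid from device code.
// This device-side launch is what requires relocatable device code.
__global__ void parent(int *data, int n)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    addOne<<<blocks, threads>>>(data, n);
}

// Returns true if device 0 reports compute capability 3.5 or higher.
bool supportsDynamicParallelism()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess)
        return false;
    return prop.major > 3 || (prop.major == 3 && prop.minor >= 5);
}

// Runs parent -> child on n zeroed ints (n > 0 assumed) and returns how many
// elements ended up equal to 1, or -1 if a CUDA call failed.
int countIncremented(int n)
{
    int *d = NULL;
    if (cudaMalloc(&d, n * sizeof(int)) != cudaSuccess)
        return -1;
    cudaMemset(d, 0, n * sizeof(int));

    parent<<<1, 1>>>(d, n);
    // The parent grid is not complete until its child grid has finished,
    // so a single host-side synchronize covers both.
    cudaError_t err = cudaDeviceSynchronize();

    std::vector<int> h(n);
    if (err == cudaSuccess)
        err = cudaMemcpy(h.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess)
        return -1;

    int count = 0;
    for (int i = 0; i < n; i++)
        if (h[i] == 1)
            count++;
    return count;
}

int main()
{
    if (!supportsDynamicParallelism())
    {
        printf("Device 0 does not support dynamic parallelism\n");
        return 1;
    }
    int n = 1000;
    printf("%d of %d elements incremented\n", countIncremented(n), n);
    return 0;
}
```

If relocatable device code is not enabled, this file should fail to build with the same "requires separate compilation mode" error quoted in the question, because of the launch inside parent. The sketch synchronizes only on the host, so it does not rely on device-side cudaDeviceSynchronize().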


 