Error in Kernel launch statement in CUDA

I am doing a small image-processing project using CUDA. I am trying to use Gaussian blurring to blur an image. Everything is fine, but I cannot figure out why the kernel launch statement shows this strange error:

[screenshot of the error; image not included]

Here is my complete code, in case it helps:

#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <cuda_runtime.h>
#include <device_launch_parameters.h>
#include <helper_cuda.h>
#include <helper_cuda_gl.h>
#include <helper_image.h>
#include <helper_cuda_drvapi.h>

unsigned int width, height;

int mask[3][3] = { 1, 2, 1,
                   2, 3, 2,
                   1, 2, 1, 
                 };

// Weighted 3x3 sum around (col, row); the mask weights add up to 15.
int getPixel(unsigned char *arr, int col, int row)
{
    int sum = 0;
    for (int j = -1; j <= 1; j++)
    {
        for (int i = -1; i <= 1; i++)
        {
            int color = arr[(row + j) * width + (col + i)];
            sum += color * mask[i + 1][j + 1];
        }
    }
    return sum / 15;
}

// CPU version of the blur; skips the image border.
void h_blur(unsigned char *arr, unsigned char *result)
{
    int offset = 2 * width;
    for (int row = 2; row < height - 3; row++)
    {
        for (int col = 2; col < width - 3; col++)
        {
            result[offset + col] = getPixel(arr, col, row);
        }
        offset += width;
    }
}

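__global__ void d_blur(unsigned char *arr, unsigned char *result, int width, int height)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;

    // Skip the image border (same bounds as h_blur) and any thread outside the image.
    if (row < 2 || col < 2 || row >= height - 3 || col >= width - 3)
        return;

    int mask[3][3] = { 1, 2, 1, 2, 3, 2, 1, 2, 1 };

    int sum = 0;

    // Accumulate the 3x3 neighbourhood weighted by the mask.
    for (int j = -1; j <= 1; j++)
    {
        for (int i = -1; i <= 1; i++)
        {
            int color = arr[(row + j) * width + (col + i)];
            sum += color * mask[i + 1][j + 1];
        }
    }
    // The mask weights sum to 15, so this division normalizes the result.
    result[row * width + col] = sum / 15;
}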

int main(int argc, char **argv)
{
    unsigned char *d_resultPixels = NULL;
    unsigned char *h_resultPixels = NULL;
    unsigned char *h_pixels = NULL;
    unsigned char *d_pixels = NULL;

    // Backslashes in Windows paths must be escaped inside C string literals.
    const char *srcPath = "C:\\ProgramData\\NVIDIA Corporation\\CUDA Samples\\v6.5\\3_Imaging\\dxtc\\data\\lena_std.ppm";
    const char *h_ResultPath = "C:\\ProgramData\\NVIDIA Corporation\\CUDA Samples\\v6.5\\3_Imaging\\dxtc\\data\\lena_std.ppm";
    const char *d_ResultPath = "C:\\ProgramData\\NVIDIA Corporation\\CUDA Samples\\v6.5\\3_Imaging\\dxtc\\data\\lena_std.ppm";

    sdkLoadPGM(srcPath, &h_pixels, &width, &height);
    int ImageSize = sizeof(unsigned char) * width * height;

    h_resultPixels = (unsigned char *)malloc(ImageSize);
    cudaMalloc((void **)&d_pixels, ImageSize);
    cudaMalloc((void **)&d_resultPixels, ImageSize);
    cudaMemcpy(d_pixels, h_pixels, ImageSize, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    // Round up so that images whose size is not a multiple of 16 are fully covered;
    // the bounds check in the kernel discards the extra threads.
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);

    d_blur<<<grid, block>>>(d_pixels, d_resultPixels, width, height);

    // cudaThreadSynchronize() is deprecated; cudaDeviceSynchronize() replaces it.
    cudaDeviceSynchronize();
    cudaMemcpy(h_resultPixels, d_resultPixels, ImageSize, cudaMemcpyDeviceToHost);
    sdkSavePGM(d_ResultPath, h_resultPixels, width, height);

    // Release device and host buffers.
    cudaFree(d_pixels);
    cudaFree(d_resultPixels);
    free(h_resultPixels);
    free(h_pixels);

    printf("Press enter to exit ...\n");
    getchar();
    return 0;
}
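
A host-side reference makes it easier to check what the kernel writes. The following is only a minimal sketch under stated assumptions, not the code from the question: blurReference and kMask are hypothetical names, the image is assumed to be single-channel and stored row by row, and a one-pixel border is copied unchanged (the question's h_blur and d_blur skip a wider border and leave it unwritten).

```cpp
#include <vector>

// Same 3x3 weights as the mask in the question; they sum to 15.
static const int kMask[3][3] = { {1, 2, 1}, {2, 3, 2}, {1, 2, 1} };

// Hypothetical CPU reference: blurs interior pixels of a row-major,
// single-channel image and copies the one-pixel border unchanged.
std::vector<unsigned char> blurReference(const std::vector<unsigned char>& src,
                                         int width, int height)
{
    std::vector<unsigned char> dst(src);  // border keeps its input values
    for (int row = 1; row < height - 1; ++row) {
        for (int col = 1; col < width - 1; ++col) {
            int sum = 0;
            for (int j = -1; j <= 1; ++j)
                for (int i = -1; i <= 1; ++i)
                    sum += src[(row + j) * width + (col + i)] * kMask[j + 1][i + 1];
            // Divide by the sum of the weights to keep the brightness unchanged.
            dst[row * width + col] = static_cast<unsigned char>(sum / 15);
        }
    }
    return dst;
}
```

Comparing the interior pixels of this result with the buffer copied back from the GPU is one way to confirm that the kernel computes the intended convolution.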

Since you are working in Visual Studio, you need to update IntelliSense. If the error shows up only as an underline in the editor and the project still builds with nvcc, it comes from IntelliSense, which does not understand the CUDA <<< >>> launch syntax. Also, you can refer to the following link for a better image convolution implementation in CUDA.

2D Image Convolution in CUDA
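
If the editor squiggle on the launch line remains distracting, one commonly used workaround is to hide the launch configuration from IntelliSense only. The sketch below relies on the __INTELLISENSE__ macro, which Visual Studio's IntelliSense engine predefines and the real compilers do not; KERNEL_ARGS is a hypothetical helper name, not part of the CUDA toolkit.

```cpp
// Sketch only: KERNEL_ARGS is an illustrative name, not a CUDA API.
// IntelliSense sees an empty macro, so the launch looks like a plain call;
// nvcc sees the real <<<grid, block>>> execution configuration.
#ifdef __INTELLISENSE__
#define KERNEL_ARGS(grid, block)
#else
#define KERNEL_ARGS(grid, block) <<<grid, block>>>
#endif

// In a .cu file, the launch from the question would then be written as:
//   d_blur KERNEL_ARGS(grid, block)(d_pixels, d_resultPixels, width, height);
```

This changes nothing in the compiled program; it affects only what the editor parses.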
