How do I enable the CUDA 7.0+ per-thread default stream in Visual Studio 2013?

Question

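I followed the approach described in "GPU Pro Tip: CUDA 7 Streams Simplify Concurrency" and tested it in VS2013 with CUDA 7.5. The multi-stream example works fine, but the multi-threading example does not give the expected result. The code is as follows: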

#include <pthread.h>
#include <cstdio>
#include <cmath>

#define CUDA_API_PER_THREAD_DEFAULT_STREAM

#include "cuda.h"

const int N = 1 << 20;

__global__ void kernel(float *x, int n)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    for (int i = tid; i < n; i += blockDim.x * gridDim.x) {
        x[i] = sqrt(pow(3.14159, i));
    }
}

void *launch_kernel(void *dummy)
{
    float *data;
    cudaMalloc(&data, N * sizeof(float));

    kernel<<<1, 64>>>(data, N);

    cudaStreamSynchronize(0);

    return NULL;
}

int main()
{
    const int num_threads = 8;

    pthread_t threads[num_threads];

    for (int i = 0; i < num_threads; i++) {
        if (pthread_create(&threads[i], NULL, launch_kernel, 0)) {
            fprintf(stderr, "Error creating thread\n");
            return 1;
        }
    }

    for (int i = 0; i < num_threads; i++) {
        if (pthread_join(threads[i], NULL)) {
            fprintf(stderr, "Error joining thread\n");
            return 2;
        }
    }

    cudaDeviceReset();

    return 0;
}

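I also tried adding the macro CUDA_API_PER_THREAD_DEFAULT_STREAM under CUDA C/C++ -> Host -> Preprocessor Definitions, but the result is the same. The timeline generated by the profiler is as follows:

[image: profiler timeline]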

Do you have any idea what is going on here? Many thanks in advance.

Answer 1

The code you posted works as expected for me when compiled and run on a Linux system with CUDA 7.0 as follows:

$ nvcc -arch=sm_30  --default-stream per-thread -o thread.out thread.cu

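So I can only assume that you either have a platform-specific problem, or your build method is incorrect (note that --default-stream per-thread must be specified for every translation unit in the build).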
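
If getting the flag onto every translation unit of a Visual Studio project proves awkward, one alternative is to name the per-thread stream explicitly through the cudaStreamPerThread handle that the runtime has provided since CUDA 7.0. The sketch below is a reworking of the code in the question, not something I have tested on VS2013. It assumes std::thread in place of pthreads (so no extra library is needed on Windows) and uses the single-precision sqrtf and powf. As far as I know, nvcc implicitly includes cuda_runtime.h at the top of each .cu file, so a #define of CUDA_API_PER_THREAD_DEFAULT_STREAM placed inside the file comes too late to take effect. Naming the stream explicitly sidesteps that.

```cuda
// Sketch only: launches into the per-thread default stream explicitly,
// so the result does not depend on --default-stream or on macro ordering.
// Assumption: std::thread replaces the pthreads used in the question.
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

const int N = 1 << 20;

__global__ void kernel(float *x, int n)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    for (int i = tid; i < n; i += blockDim.x * gridDim.x) {
        x[i] = sqrtf(powf(3.14159f, (float)i));
    }
}

void launch_kernel()
{
    float *data = NULL;
    if (cudaMalloc(&data, N * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return;
    }

    // The fourth launch parameter selects this host thread's own default stream.
    kernel<<<1, 64, 0, cudaStreamPerThread>>>(data, N);

    // Wait for this thread's stream only, then release the buffer.
    cudaError_t err = cudaStreamSynchronize(cudaStreamPerThread);
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    }
    cudaFree(data);
}

int main()
{
    const int num_threads = 8;
    std::vector<std::thread> threads;

    for (int i = 0; i < num_threads; i++) {
        threads.emplace_back(launch_kernel);
    }
    for (size_t i = 0; i < threads.size(); i++) {
        threads[i].join();
    }

    cudaDeviceReset();
    return 0;
}
```

With this version a plain nvcc -arch=sm_30 build should be enough (on Linux with CUDA 7.x you would also need -std=c++11 for std::thread). Whether the kernels actually overlap in the profiler still depends on the device and driver.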

Answer 2

Update: concurrency can occur when I add a cudaFree call, as shown below. Is this because of a lack of synchronization?

void *launch_kernel(void *dummy)
{
    float *data;
    cudaMalloc(&data, N * sizeof(float));

    kernel<<<1, 64>>>(data, N);
    cudaFree(data); // Concurrency may happen when I add this line
    cudaStreamSynchronize(0);

    return NULL;
}

Compiled like this:

nvcc -arch=sm_30  --default-stream per-thread -lpthreadVC2 kernel.cu -o kernel.exe

2 answers

Answer 1: 1

Answer 2: 1, accepted, 2016-10-06 08:10:54
