Can CUDA Kernels Modify Host Memory?

Is there any way to get a kernel to modify an integer by passing a pointer to that integer to the kernel? It seems the pointer is pointing to an address in device memory, so the kernel does not affect the host.

Here's a simplified example with the behavior I've noticed.

#include "cuda_runtime.h"
#include "device_launch_parameters.h"

#include <iostream>

__global__
void change_cuda(int* c);

void change_var(int* c);

int main() {
    using namespace std; 

    int c = 0;
    int* ptc = &c;

    change_var(ptc); // *ptc = 123

    cout << c << endl;

    cudaError_t errors;

    cudaMallocManaged((void**)&ptc, sizeof(int));

    change_cuda<<<1, 1>>> (ptc); // *ptc = 555

    errors = cudaDeviceSynchronize();

    cudaFree(ptc);

    cout << cudaGetErrorString(errors) << endl;
    cout << c << endl;

    return 0;
}

__global__
void change_cuda(int* c) {
    *c = 555;
}

void change_var(int* c) {
    *c = 123;
}

Ideally, this would modify c to be 555 at the end, but the output of this example is:

123
no error
123

Clearly I am misunderstanding how this works. What is the correct way to get the behavior that I expect?

Yes, you have a misunderstanding. cudaMallocManaged is an allocator, like malloc or new. It returns a pointer to a new allocation of the requested size.

It is not a method for making your stack-based host variable accessible from device code.

However, the allocated area pointed to by the pointer returned by cudaMallocManaged can be accessed either from device code or host code. (It will not point to your c variable.)
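If the goal is still to end up with the kernel's result in an ordinary host variable, one option is to copy it out of the managed allocation yourself once the kernel has finished. A minimal sketch, assuming a device that supports managed memory; kernel_result_via_managed is a hypothetical helper name and the -1 failure value is just a convention chosen for this example:

```cuda
#include <cuda_runtime.h>

__global__ void change_cuda(int* c) {
    *c = 555;
}

// Hypothetical helper (not from the question): runs the kernel on a managed
// allocation and returns the value it wrote, or -1 if any CUDA call fails.
int kernel_result_via_managed() {
    int c = 0;            // ordinary host stack variable, never seen by the GPU
    int* ptc = nullptr;

    // ptc now points to a NEW allocation; it has nothing to do with &c.
    if (cudaMallocManaged((void**)&ptc, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    *ptc = c;             // host code may write to the managed allocation

    change_cuda<<<1, 1>>>(ptc);
    cudaError_t err = cudaGetLastError();        // catches launch errors
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();           // wait for the kernel to finish
    }

    // Copy the result into the host variable by hand, before freeing.
    c = (err == cudaSuccess) ? *ptc : -1;

    cudaFree(ptc);
    return c;
}
```

Calling kernel_result_via_managed() from main and printing the return value should show what the kernel wrote. The assignment c = *ptc is what actually moves the value into the stack variable.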

You can minimally fix your code by making the following changes:

1. Comment out the call to cudaFree (as written, the allocation is freed before its value is read).
2. Print out the value of *ptc rather than c.

Perhaps a more sensible change might be like this:

int main() {
    using namespace std; 

    int* ptc;

    cudaMallocManaged((void**)&ptc, sizeof(int));

    change_var(ptc); // *ptc = 123

    cout << *ptc << endl;

    cudaError_t errors;

    change_cuda<<<1, 1>>> (ptc); // *ptc = 555

    errors = cudaDeviceSynchronize();

    cout << cudaGetErrorString(errors) << endl;
    cout << *ptc << endl;

    cudaFree(ptc); // release the allocation only after its value has been read

    return 0;
}
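If you would rather keep c as a plain host variable, the conventional alternative to managed memory is an explicit device allocation with a copy in each direction. The kernel still never touches host memory; cudaMemcpy moves the value. This is only a sketch: change_host_int is a hypothetical helper name, and it assumes the default stream is used throughout.

```cuda
#include <cuda_runtime.h>

__global__ void change_cuda(int* c) {
    *c = 555;
}

// Hypothetical helper (not from the question): copies *host_c to the device,
// lets the kernel modify the device copy, then copies the result back.
cudaError_t change_host_int(int* host_c) {
    int* dev_c = nullptr;
    cudaError_t err = cudaMalloc((void**)&dev_c, sizeof(int));
    if (err != cudaSuccess) {
        return err;
    }

    err = cudaMemcpy(dev_c, host_c, sizeof(int), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        change_cuda<<<1, 1>>>(dev_c);
        err = cudaGetLastError();   // catches launch errors
    }
    if (err == cudaSuccess) {
        // This copy is ordered after the kernel on the default stream and
        // blocks the host until the value has arrived in *host_c.
        err = cudaMemcpy(host_c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);
    }

    cudaFree(dev_c);
    return err;
}
```

With this helper, main could keep int c = 123; on the stack, call change_host_int(&c), and print c afterwards.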
