
Why does my kernel's shared memory seem to be initialized to zero?

As mentioned in the "Shared Memory Array Default Value" question, shared memory is not initialized, i.e. it can contain any value.

#include <stdio.h>

#define BLOCK_SIZE 512

__global__ void scan(float *input, float *output, int len) {
    __shared__ int data[BLOCK_SIZE];

    // DEBUG
    if (threadIdx.x == 0 && blockIdx.x == 0)
    {
        printf("Block Number: %d\n", blockIdx.x);
        for (int i = 0; i < BLOCK_SIZE; ++i)
        {
            printf("DATA[%d] = %d\n", i, data[i]);
        }
    }

}

int main(int argc, char ** argv) {
    dim3 block(BLOCK_SIZE, 1, 1);
    dim3 grid(10, 1, 1);
    scan<<<grid, block>>>(NULL, NULL, 0);  // len is an int, so pass 0 rather than NULL
    cudaDeviceSynchronize();
    return 0;
}

But why does that not hold for this code, where I constantly get zeroed shared memory?

DATA[0] = 0
DATA[1] = 0
DATA[2] = 0
DATA[3] = 0
DATA[4] = 0
DATA[5] = 0
DATA[6] = 0
...

I tested in both Release and Debug modes: "-O3 -arch=sm_20", "-O3 -arch=sm_30" and "-arch=sm_30". The result is always the same.

tl;dr: shared memory is not initialized to 0.

I think your conjecture that shared memory is initialized to 0 is questionable. Try the following code, which is a slight modification of yours. Here, I call the kernel twice and alter the values of the data array inside it. The first time the kernel is launched, the "uninitialized" values of data are all 0. The second time the kernel is launched, the "uninitialized" values of data are no longer all 0.

I think this happens because shared memory is SRAM, which exhibits data remanence.

#include <stdio.h>

#define BLOCK_SIZE 32

__global__ void scan(float *input, float *output, int len) {

    __shared__ int data[BLOCK_SIZE];

    if (threadIdx.x == 0 && blockIdx.x == 0)
    {
        for (int i = 0; i < BLOCK_SIZE; ++i)
        {
            printf("DATA[%d] = %d\n", i, data[i]);
            data[i] = i;
        }

    }
}

int main(int argc, char ** argv) {
    dim3 block(BLOCK_SIZE, 1, 1);
    dim3 grid(10, 1, 1);
    // Launch twice: the second launch reads whatever the first one left behind.
    scan<<<grid, block>>>(NULL, NULL, 0);
    scan<<<grid, block>>>(NULL, NULL, 0);
    cudaDeviceSynchronize();
    getchar();
    return 0;
}
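
Since the contents of shared memory cannot be relied on at kernel start, a kernel that needs zeros has to write them itself. The following is only a minimal sketch of that idea, not part of the original code: the kernel name scan_init, the nonzero_count output buffer and the NUM_BLOCKS constant are hypothetical names introduced here for illustration. Each block clears its shared array, synchronizes, and then counts any nonzero entries before dirtying the array again.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define BLOCK_SIZE 32
#define NUM_BLOCKS 10

// Hypothetical kernel (not from the original post): clears shared memory
// explicitly before reading it, then reports how many entries were nonzero.
__global__ void scan_init(int *nonzero_count) {
    __shared__ int data[BLOCK_SIZE];

    // Strided loop so every element is cleared even if blockDim.x != BLOCK_SIZE.
    for (int i = threadIdx.x; i < BLOCK_SIZE; i += blockDim.x) {
        data[i] = 0;
    }
    __syncthreads();  // make the zeros visible to every thread in the block

    if (threadIdx.x == 0) {
        int count = 0;
        for (int i = 0; i < BLOCK_SIZE; ++i) {
            if (data[i] != 0) {
                ++count;
            }
            data[i] = i + 1;  // dirty the array so a later launch may see stale values
        }
        nonzero_count[blockIdx.x] = count;
    }
}

int main(void) {
    int *d_count = NULL;
    int h_count[NUM_BLOCKS];

    if (cudaMalloc(&d_count, NUM_BLOCKS * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    for (int launch = 0; launch < 2; ++launch) {
        scan_init<<<NUM_BLOCKS, BLOCK_SIZE>>>(d_count);

        // cudaMemcpy waits for the kernel to finish before copying.
        cudaError_t err = cudaMemcpy(h_count, d_count, sizeof(h_count),
                                     cudaMemcpyDeviceToHost);
        if (err != cudaSuccess) {
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
            cudaFree(d_count);
            return 1;
        }

        int total = 0;
        for (int b = 0; b < NUM_BLOCKS; ++b) {
            total += h_count[b];
        }
        printf("launch %d: %d nonzero entries after explicit init\n", launch, total);
    }

    cudaFree(d_count);
    return 0;
}
```

With the explicit clear and the __syncthreads() barrier in place, both launches should report 0 nonzero entries, regardless of what an earlier kernel left in shared memory. Removing the clearing loop brings back the behavior discussed above, where the values read depend on whatever happened to be in the SRAM.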

Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original address.
