How to set elements of an array to zero by index in CUDA?

I am trying to use CUDA to set some elements of an array to zero by index. The array has about 7,000,000 elements and the index list has about 1,000 entries, so I want to write the kernel efficiently. The only technique I know is to choose the block size with cudaOccupancyMaxPotentialBlockSize. Could anyone suggest how to speed this up?

For example, the array is pointed to by double *a and has size n. The index list is pointed to by int *index and has length n1.

__global__ void setZero(int n, double * a,int n1, const int* index)
{
  int i = threadIdx.x + blockIdx.x * blockDim.x;
  if (i<n)
  {
    for(int ii=0; ii<n1; ii++) 
      if(i==index[ii]-1)
        a[i] = 0;
  }
}

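int main()
{
    // n, d_a, n1 and d_index are assumed to be set up already
    // (device allocations and host-to-device copies are omitted here).
    int blockSize;
    int minGridSize;
    int gridSize;
    cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, setZero, 0, n);
    gridSize = (n + blockSize - 1) / blockSize;
    setZero<<<gridSize, blockSize>>>(n, d_a, n1, d_index);
    return 0;
}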

As a mini sample: a = {1,2,3,4,5}, index = [2,4] (the indices are 1-based). The output is a = {1,0,3,0,5}.

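Given your constraints, I think the following would already be good enough. It launches one thread per index entry instead of one per array element, so only n1 (about 1,000) threads do any work and each performs a single write: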

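__global__ void setZero(int n, double *a, int n1, const int *index)
{
  // One thread per index entry; n1 is the number of entries in index.
  int id = threadIdx.x + blockIdx.x * blockDim.x;
  if (id < n1)
  {
    int target = index[id] - 1;  // indices are 1-based, as in the question
    if (target >= 0 && target < n)
      a[target] = 0;
  }
}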
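
To make the launch configuration concrete, here is a minimal sketch of a complete program built around the same one-thread-per-index idea, using the mini sample from the question. The kernel name setZeroByIndex, the fixed block size of 256, and the bounds check are assumptions made for illustration and are not part of the original answer. The important difference from the question's code is that the grid is sized from n1, not from n.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel name. One thread per index entry.
// Assumption: indices are 1-based, as in the question's sample.
__global__ void setZeroByIndex(int n, double *a, int n1, const int *index)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    if (id < n1) {
        int target = index[id] - 1;        // convert to 0-based
        if (target >= 0 && target < n)     // skip out-of-range indices
            a[target] = 0.0;
    }
}

int main()
{
    const int n = 5, n1 = 2;
    double h_a[n] = {1, 2, 3, 4, 5};
    int h_index[n1] = {2, 4};

    // Copy the array and the index list to the device.
    double *d_a = nullptr;
    int *d_index = nullptr;
    cudaMalloc(&d_a, n * sizeof(double));
    cudaMalloc(&d_index, n1 * sizeof(int));
    cudaMemcpy(d_a, h_a, n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_index, h_index, n1 * sizeof(int), cudaMemcpyHostToDevice);

    // Size the grid from n1 (number of indices), not n (array length).
    const int blockSize = 256;             // assumed; any valid size works here
    const int gridSize = (n1 + blockSize - 1) / blockSize;
    setZeroByIndex<<<gridSize, blockSize>>>(n, d_a, n1, d_index);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy the result back and print it.
    cudaMemcpy(h_a, d_a, n * sizeof(double), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i)
        printf("%g ", h_a[i]);
    printf("\n");

    cudaFree(d_a);
    cudaFree(d_index);
    return 0;
}
```

With the sample input this should leave a = {1,0,3,0,5}, matching the output given in the question. Duplicate entries in index are harmless in this scheme, because every thread that targets the same element writes the same value, zero.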
