My problem is this: I have a 3D array, I cannot use cudaMalloc3D, and I need to convert it to a 1D array and manipulate it on the GPU, but I don't know how to do it. At the moment I am using
#define nx 8
#define ny 6
#define nz 4
to define the array: 4 matrices of 6 rows with 8 columns, indexed by i, j, k:
u[i][j][k]
and I have the following host code:
cudaMalloc( (void**)&dev_u, ny * nx * nz * sizeof(float) ) ;
cudaMemcpy( dev_u, u, ny * nx * nz * sizeof(float), cudaMemcpyHostToDevice );
dim3 dimBlock(nx,ny,nz);
dim3 dimGrid(1,1);
FTCS3D<<<dimGrid, dimBlock>>>( dev_u );
cudaMemcpy( u, dev_u, ny * nx * nz * sizeof(float), cudaMemcpyDeviceToHost );
The kernel on the GPU:
__global__ void FTCS3D( float *u )
{
    int i = threadIdx.y+blockDim.y*blockIdx.y;
    int j = threadIdx.x+blockDim.x*blockIdx.x;
    int k = threadIdx.z+blockDim.z*blockIdx.z;
    int offset = i * nx + j + ny * nx * k;
    int totid=nx*ny*nz;
    if (offset < totid)
    {
        if ( offset ==1 )
            u[offset]=5.0;
    }
}
The number 5 appears in another matrix, not in u[0][0][1]. I have no idea how to map all the index variables into the offset. Remember, I have to do it this way, with a 1D vector.
If you have a 3D array a3D[HEIGHT][WIDTH][DEPTH], you can turn it into a 1D array a1D[HEIGHT * WIDTH * DEPTH].
Outside your kernel, convert the 3D array to a 1D array:
for (int x = 0, k = 0; x < HEIGHT; x++)
    for (int y = 0; y < WIDTH; y++)
        for (int z = 0; z < DEPTH; z++)
            a1D[k++] = a3D[x][y][z];
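The loop above fixes the memory layout: z varies fastest, so a3D[x][y][z] ends up at index (x * WIDTH + y) * DEPTH + z. Below is a minimal host-side sketch of that mapping in plain C. It assumes the sizes from the question map to HEIGHT = 4, WIDTH = 6, DEPTH = 8; that mapping and the helper names flat_index and flatten are illustrative assumptions, not part of the original posts.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes: 4 matrices of 6 rows with 8 columns. */
#define HEIGHT 4
#define WIDTH  6
#define DEPTH  8

/* Hypothetical helper: row-major offset of a3D[x][y][z]. */
static size_t flat_index(size_t x, size_t y, size_t z)
{
    assert(x < HEIGHT && y < WIDTH && z < DEPTH);
    return (x * WIDTH + y) * DEPTH + z;
}

/* Hypothetical helper: copy the 3D array into a 1D buffer of
   HEIGHT * WIDTH * DEPTH floats, using the same layout as the loop. */
static void flatten(float a3D[HEIGHT][WIDTH][DEPTH], float *a1D)
{
    for (size_t x = 0; x < HEIGHT; x++)
        for (size_t y = 0; y < WIDTH; y++)
            for (size_t z = 0; z < DEPTH; z++)
                a1D[flat_index(x, y, z)] = a3D[x][y][z];
}
```

With this layout, offset 1 corresponds to a3D[0][0][1], and the buffer filled by flatten can be passed to cudaMemcpy as in the question.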
Why not use only one dimension in your CUDA kernel?
__global__ void FTCS3D( float *u, int HEIGHT, int WIDTH, int DEPTH)
{
    int x = threadIdx.x+blockDim.x*blockIdx.x;
    int totid = HEIGHT * WIDTH * DEPTH;
    if (x < totid)
    {
        if (x==1 )
            u[x]=5.0;
    }
}
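With a 1D launch each thread only knows its flat index, so a kernel that needs the 3D coordinates has to recover them with division and remainder. The sketch below shows that inverse mapping in plain host-side C, assuming the row-major layout in which the last dimension varies fastest. Coord3, unflatten and blocks_for are hypothetical names introduced for illustration; the same integer arithmetic could be used inside the kernel.

```c
#include <assert.h>

/* Hypothetical type holding the three coordinates of one element. */
typedef struct { int x, y, z; } Coord3;

/* Inverse of offset = (x * width + y) * depth + z. */
static Coord3 unflatten(int offset, int width, int depth)
{
    Coord3 c;
    assert(offset >= 0 && width > 0 && depth > 0);
    c.z = offset % depth;            /* fastest-varying index */
    c.y = (offset / depth) % width;  /* middle index */
    c.x = offset / (depth * width);  /* slowest-varying index */
    return c;
}

/* Hypothetical helper: number of blocks needed so that
   blocks * threads_per_block >= total (ceiling division). */
static int blocks_for(int total, int threads_per_block)
{
    assert(total >= 0 && threads_per_block > 0);
    return (total + threads_per_block - 1) / threads_per_block;
}
```

For the 8 * 6 * 4 = 192 elements in the question, blocks_for(192, 64) gives 3 blocks of 64 threads. The `if (x < totid)` guard in the kernel covers sizes that are not a multiple of the block size.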