
CUDA thread linear indexing

I need to linearly index my threads in such a way that I can be sure the first 32 of them belong to the same warp, i.e., the linear index must follow the way warps are formed internally. In other words, is the linear index used to form warps C-like (row-major) or Fortran-like (column-major)? To illustrate, consider a block of 2x5 threads. I can create a linear index that follows either the C or the Fortran convention:

0, 1, 2, 3, 4
5, 6, 7, 8, 9

vs.

0, 2, 4, 6, 8
1, 3, 5, 7, 9

For a large array, I want to be sure that my first 32 threads are all in the same warp. What is the correct way to generate the linear index?

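Your threads are scheduled in groups of 32. Threads 0 to 31 of a block fall in the first warp, threads 32 to 63 in the second, and so on. If the number of threads per block is not a multiple of 32, the last warp is padded with inactive "shadow" threads. These padding lanes never run your code. A bounds check (usually an if statement on the global index) is still needed whenever the launch contains more threads than there are array elements, so that the extra threads do not access wrong memory positions.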

You cannot change this ordering, so the first 32 threads of a block will always be in the same warp. Being in the first warp does not guarantee that the warp is executed first, though: the SM schedules warps at its convenience.
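The warp arithmetic described above can be sketched on the host in plain C++. This is only an illustration, not CUDA API code: the names kWarpSize, warps_per_block and padded_lanes are made up for this example, and the warp size of 32 is assumed (device code can read it from the built-in warpSize).

```cpp
// Assumed warp size; hypothetical constant for this sketch.
constexpr int kWarpSize = 32;

// Number of warps needed to hold all threads of one block (rounded up).
constexpr int warps_per_block(int threads_per_block) {
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Number of inactive padding lanes in the last warp of the block.
constexpr int padded_lanes(int threads_per_block) {
    return warps_per_block(threads_per_block) * kWarpSize - threads_per_block;
}

// A block of 100 threads occupies 4 warps; the last warp has 28 unused lanes.
static_assert(warps_per_block(100) == 4, "100 threads need 4 warps");
static_assert(padded_lanes(100) == 28, "4 * 32 - 100 = 28 padding lanes");
```

Choosing a block size that is a multiple of 32 makes padded_lanes return 0, so no lanes are wasted.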

From this answer:

Threads are grouped into warps in the order x, y, z, with x varying fastest. So a 16x16 thread block will have threads in the following order in the first 32-thread warp:

position in warp: thread ID (x,y,z)

0: 0,0,0
1: 1,0,0
2: 2,0,0
3: 3,0,0
...
15: 15,0,0
16: 0,1,0
17: 1,1,0
18: 2,1,0
19: 3,1,0
...
31: 15,1,0
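The ordering listed above corresponds to a linear index in which x varies fastest. A minimal host-side C++ sketch of that formula follows; the function names are hypothetical, and the parameters tx, ty, tz and dx, dy stand in for threadIdx.x/y/z and blockDim.x/y inside a kernel. A warp size of 32 is assumed.

```cpp
// Assumed warp size; hypothetical constant for this sketch.
constexpr int kWarpSize = 32;

// Linear thread index within a block: x varies fastest, then y, then z.
// In a kernel: threadIdx.x + threadIdx.y * blockDim.x
//              + threadIdx.z * blockDim.x * blockDim.y
constexpr int linear_thread_id(int tx, int ty, int tz, int dx, int dy) {
    return tx + ty * dx + tz * dx * dy;
}

// Which warp of the block a linear index falls into.
constexpr int warp_id(int linear_id) { return linear_id / kWarpSize; }

// Position of the thread inside its warp.
constexpr int lane_id(int linear_id) { return linear_id % kWarpSize; }

// 16x16 block: thread (15,1,0) is the last lane of warp 0,
// and thread (0,2,0) is the first lane of warp 1.
static_assert(linear_thread_id(15, 1, 0, 16, 16) == 31, "last lane of warp 0");
static_assert(warp_id(linear_thread_id(0, 2, 0, 16, 16)) == 1, "starts warp 1");
```

For the 2x5 block from the question, taking blockDim.x = 5 and blockDim.y = 2, thread (0,1,0) gets index 5. This matches the first layout shown in the question when each row corresponds to one value of y.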


 