What is meaning of following syntax:
Kernel_fun<<<256, 128, 2056>>>(arg1, arg2, arg3);
Which value indicates workgroup and which value indicates thread.
From the CUDA Programming Guide , appendix B.22 (as of May 2019):
The execution configuration is specified by inserting an expression of the form <<< Dg, Db, Ns, S >>> between the function name and the parenthesized argument list, where:
Dg is of type dim3 (see Section B.3.2) and specifies the dimension and size of the grid, such that Dg.x * Dg.y * Dg.z equals the number of blocks being launched; Dg.z must be equal to 1 for devices of compute capability 1.x;
Db is of type dim3 (see Section B.3.2) and specifies the dimension and size of each block, such that Db.x * Db.y * Db.z equals the number of threads per block;
Ns is of type size_t and specifies the number of bytes in shared memory that is dynamically allocated per block for this call in addition to the statically allocated memory; this dynamically allocated memory is used by any of the variables declared as an external array as mentioned in Section B.2.3; Ns is an optional argument which defaults to 0;
S is of type cudaStream_t and specifies the associated stream; S is an optional argument which defaults to 0.
In short: <<< number of blocks, number of threads, dynamic memory per block, associated stream >>>
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.