Metal compute shaders threadgroup & threadExecutionWidth

Can someone explain in simple terms what a threadgroup conceptually is in Metal compute shaders, along with related terms such as SIMD group and threadExecutionWidth (wavefront)? I read the docs but am more confused. For instance, if I have a 1024x1024 image, how many threadgroups can I have, how can I map a thread to each pixel, how many can run concurrently, etc.? I can't find a WWDC video describing compute shaders and these concepts.

A threadgroup is a group of threads that work together to solve a certain (sub)problem. You can have a maximum of 512 or 1024 threads in a threadgroup (depending on the device you're using).

The threadExecutionWidth is the size of the SIMD groups used. It's typically 32, meaning each SIMD group has 32 threads in it. For optimal performance, the number of threads in your threadgroup should be a multiple of threadExecutionWidth. (This is indeed what others call the wavefront or warp.)
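To make the "multiple of threadExecutionWidth" rule concrete, here is a minimal sketch in plain C++ of one common way to pick a 2D threadgroup shape. It does not call Metal: `executionWidth` and `maxThreads` are stand-ins for the values the device reports (threadExecutionWidth and the maximum threads per threadgroup), and `GroupSize` and `chooseGroupSize` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Width and height of a 2D threadgroup, measured in threads.
struct GroupSize {
    std::uint32_t width;
    std::uint32_t height;
};

// Pick a threadgroup shape whose total thread count is a multiple of the
// SIMD-group size: each row is exactly one SIMD group wide, and the number
// of rows is as large as the per-threadgroup thread limit allows.
// Both parameters are assumed to come from the device at run time.
GroupSize chooseGroupSize(std::uint32_t executionWidth, std::uint32_t maxThreads) {
    assert(executionWidth > 0 && maxThreads >= executionWidth);
    return GroupSize{executionWidth, maxThreads / executionWidth};
}
```

With an execution width of 32 and a limit of 512 threads, this yields a 32x16 threadgroup, i.e. 16 SIMD groups of 32 threads each. Other shapes with the same total (say 64x8) satisfy the rule equally well.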

If you have a 1024x1024 image and you want one thread to process one pixel, and the maximum threadgroup size is 512, then you can create a grid of 1024x1024 threads that consists of 32x64 threadgroups of size 32x16 (i.e. 512 threads each).

But really, you can divide up the threads however you want. You could also have a grid of 2x1024 threadgroups of size 512x1, or whatever.
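The arithmetic behind these layouts, and the thread-to-pixel mapping the question asks about, can be sketched in plain C++ as follows. This is an illustration under stated assumptions rather than Metal code: `Extent2D`, `groupsPerGrid`, and `pixelForThread` are hypothetical names. In an actual Metal kernel you would normally not compute the pixel yourself, since the `thread_position_in_grid` attribute supplies the same coordinate.

```cpp
#include <cassert>
#include <cstdint>

// A 2D size or coordinate, in threads or pixels.
struct Extent2D {
    std::uint32_t x;
    std::uint32_t y;
};

// Number of threadgroups needed to cover the image in each dimension.
// The division rounds up so that an image whose size is not a multiple
// of the threadgroup size is still fully covered.
Extent2D groupsPerGrid(Extent2D image, Extent2D groupSize) {
    assert(groupSize.x > 0 && groupSize.y > 0);
    return Extent2D{(image.x + groupSize.x - 1) / groupSize.x,
                    (image.y + groupSize.y - 1) / groupSize.y};
}

// Pixel handled by one thread: the threadgroup's index times the
// threadgroup size, plus the thread's position inside its threadgroup.
Extent2D pixelForThread(Extent2D groupIndex, Extent2D groupSize, Extent2D localIndex) {
    return Extent2D{groupIndex.x * groupSize.x + localIndex.x,
                    groupIndex.y * groupSize.y + localIndex.y};
}
```

For a 1024x1024 image, a 32x16 threadgroup gives 32x64 threadgroups and a 512x1 threadgroup gives 2x1024, matching the numbers above. When the image size is not an exact multiple of the threadgroup size, the rounded-up grid contains threads that fall outside the image, so the kernel should check its coordinate against the image bounds before reading or writing.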

