I have a Metal-based Core Image convolution kernel that was using half-precision variables for keeping track of sums and weights. However, I have now found that the range of 16-bit half is not enough in some cases, which means I need 32-bit float for some variables.
Now I'm wondering which is more performant:
A. Use half as much as possible (for the samplers and most local vars) and only convert to float when needed (which means quite a lot, inside the loop).
B. Use the float type everywhere so that no conversion is necessary.
The latter would mean that all arithmetic is performed in 32-bit precision, though it would only be needed for some operations.
Is there any documentation or benchmark I can run to find the cost of float ↔ half conversion in Metal?
I believe you should go with option A:
use half as much as possible (for the samplers and most local vars) and only convert to float when needed (which means quite a lot, inside the loop)
based on the discussion in the WWDC 2016 talk "Advanced Metal Shader Optimization".
The relevant section for this topic runs from roughly 17:17 to 18:58. The speaker, Fiona, mentions a couple of important points:
- 32-bit types (such as float) use twice as many registers, which means twice as much bandwidth, energy, etc. So using half saves registers (which is always good) and energy.
- For textures that have half-precision floating-point pixel color values, the conversions from half to float are lossless, so you don't have to worry about losing precision either.
There are some other relevant points worth looking into in that section, but I believe this is enough to justify going with option A.
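
To make option A concrete, here is a minimal sketch of how such a kernel might look. It is not the asker's actual code: the kernel name weightedConvolution, the radius parameter, and the luminance-based weight are all made up for illustration. It also assumes that each per-sample product fits comfortably in half, so that only the running sums need float.

```metal
#include <metal_stdlib>
#include <CoreImage/CoreImage.h>
using namespace metal;

extern "C" { namespace coreimage {

// Hypothetical kernel illustrating option A: sample and compute in half,
// widen to float only where the value range requires it (the accumulators).
float4 weightedConvolution(sampler_h src, float radius, destination dest) {
    const float2 dc = dest.coord();
    const int r = int(radius);

    // Accumulators are float: sums over a large window can exceed
    // the largest finite half value (65504).
    float4 colorSum = float4(0.0f);
    float weightSum = 0.0f;

    for (int y = -r; y <= r; ++y) {
        for (int x = -r; x <= r; ++x) {
            // Per-sample values stay in half precision.
            const half4 color = src.sample(src.transform(dc + float2(x, y)));

            // Made-up weight (luminance plus a small bias so the
            // weight sum can never be zero).
            const half weight =
                dot(color.rgb, half3(0.2126h, 0.7152h, 0.0722h)) + 0.001h;

            // Assumption: color * weight fits in half for the inputs used.
            // The conversion to float happens only at accumulation.
            colorSum += float4(color * weight);
            weightSum += float(weight);
        }
    }

    return colorSum / weightSum;
}

}}
```

If the per-sample products can themselves leave the half range (for example with HDR input), the multiplication would have to be done in float as well, as in float4(color) * float(weight).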