
Is there a way to use OpenCL C mad function in Vulkan SPIR-V?

As we know, there are at least two ways to calculate a * b + c:

  1. ret := a*b; ret := ret + c;

  2. ret := fma(a, b, c);

But OpenCL C has a third option, a function called "mad", that trades precision for performance.

In the LunarG SDK, the default SPIR-V compiler compiles the GLSL and HLSL shading languages, and the "mad" function is not mentioned in the GLSL spec v4.60.

How do I use the "mad" function in Vulkan?

There's a bit of misunderstanding here.

A fused multiply-add does not mean less precision. What it may mean is a slightly different result than a separate multiply followed by an add, because the hardware keeps a different internal precision between the steps of the operation. For this reason, some APIs and languages do not enable automatic FMA by default; it only kicks in when fast-math or a specific flag is passed to your compiler. There may be systems where it results in poorer precision, but that isn't what the term implies.

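In SPIR-V, however, while there doesn't appear to be a dedicated core instruction for FMA (GLSL's fma() built-in is exposed through the Fma instruction of the GLSL.std.450 extended instruction set rather than a core opcode), the spec explicitly anticipates and allows for contraction when SPIR-V is compiled down to GPU assembly. It even has a NoContraction decoration in the language.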

NoContraction: Apply to an arithmetic instruction to indicate the operation cannot be combined with another instruction to form a single operation. For example, if applied to an OpFMul, that multiply can't be combined with an addition to yield a fused multiply-add operation. Furthermore, such operations are not allowed to reassociate; e.g., add(a + add(b+c)) cannot be transformed to add(add(a+b) + c).
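The reassociation clause matters because floating-point addition is not associative. A small C sketch of that (hypothetical function names, assuming IEEE 754 doubles with round-to-nearest-even; the volatile intermediates only force each partial sum to be rounded to double):

```c
#include <math.h>

/* (a + b) + c : the grouping add(add(a+b) + c) */
double sum_left(double a, double b, double c) {
    volatile double partial = a + b;
    return partial + c;
}

/* a + (b + c) : the grouping add(a + add(b+c)) */
double sum_right(double a, double b, double c) {
    volatile double partial = b + c;
    return a + partial;
}
```

With a = 1, b = 2^53 and c = -2^53, sum_left returns 0 because 1 + 2^53 rounds back to 2^53, while sum_right returns 1 because b + c cancels exactly first. A compiler that is free to regroup could turn one into the other, which is what the decoration rules out.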

Note that SPIR-V is not the be-all and end-all of your shader. It is only a portable intermediate representation of your shader, which is then further compiled by your vendor's Vulkan driver. No machine runs SPIR-V directly. These kinds of optimizations are left to the driver to perform, rather than to the programmer. You can generally assume that such an optimization will occur under the appropriate conditions; the same is true of other programming languages that lack an explicit FMA builtin.
