
OpenGL ARB shader ballot: readInvocationARB returns different results for an inactive invocation

I am writing a shader that runs the same algorithm several times on different inputs to calculate intermediate results, which are later combined into the final result. The data depends on the in-game world position, and for invocations in one group the positions are close enough that I can process all inputs on different invocations simultaneously and then move the intermediate results between invocations to calculate the final results. The tricky part is that some intermediate results allow the calculation to exit early. This means that some invocations depend on other invocations' calculations while others may not, so the latter are inactive when the results are moved between invocations.

OpenGL has the ARB_shader_ballot extension, which allows reading values from other invocations, specifically through the function readInvocationARB(genUType value, uint invocationIndex). Its description says:

The function readInvocationARB() returns the <value> from a given <invocationIndex> to all active invocations in the sub-group. The <invocationIndex> must be the same for all active invocations in the sub-group otherwise results are undefined.

This does not explicitly state whether <value> depends on the read-from invocation being active. I decided to test what the result would be if that invocation is inactive.

#version 460
#extension GL_ARB_shader_ballot : enable

uniform uint data;
layout(location = 0) out vec4 color;

void main() { 
    uint var;
    var = data;
    
    if(gl_SubGroupInvocationARB != 20) {
        const uint result = readInvocationARB(var, 20);
        color = vec4(result == data);
    }
    else color = vec4(1, 0, 0, 1);
}

The result was a white image with red dots, which means that in this case the <value> read from the inactive invocation 20 is what had been written to the variable earlier.

Then I decided to make the value of var conditional:

#version 460
#extension GL_ARB_shader_ballot : enable

uniform uint threshold;
uniform uint data;

layout(location = 0) out vec4 color;

void main() { 
    uint var;
    if(gl_SubGroupInvocationARB < threshold)
         var = data;
    else var = 10;
    
    if(gl_SubGroupInvocationARB != 20) {
        const uint result = readInvocationARB(var, 20);
        color = vec4(result == data);
    }
    else color = vec4(1, 0, 0, 1);
    
    return; 
}

The data uniform was set to 10 (so the value is actually always the same), and the value of the threshold uniform doesn't matter (the results are the same regardless). In the second case the image was black with red dots, which means that <value> is not what was written to the variable earlier (it is actually 0).

I then tested these two shaders on another computer, which had an AMD graphics card (the first one had Nvidia). There, the output image was white with red dots in both cases, which differs from what the first PC displayed. This is not rigorous testing, but it shows that different hardware and software configurations produce different results for the same shader programs.

So the question is:

  • Am I missing something, and the result of readInvocationARB() for an inactive invocation is undefined? In that case, where can I find a complete description of this extension? Or,
  • is the result defined, and either I am doing something else wrong or one of the outputs produced is incorrect?

As the quoted part of the specification states, invocationIndex must be an active invocation within the subgroup. If it isn't, you have passed the function invalid data and therefore get undefined behavior.

Because of the branch, it is not known which invocations are part of the same subgroup. That is, you did nothing to ensure that 20 was an active invocation index in the same subgroup. You didn't check whether ballotARB(true) & (0x1 << 20) is nonzero.
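The check mentioned above can be turned into a small diagnostic shader. This is only a sketch, assuming GL_ARB_gpu_shader_int64 is available for the uint64_t mask that ballotARB() returns; the names SOURCE_INVOCATION, activeMask and sourceActive, and the green/blue debug colours, are made up for illustration and are not from the question.

```glsl
#version 460
#extension GL_ARB_shader_ballot : enable
#extension GL_ARB_gpu_shader_int64 : enable  // provides uint64_t, the return type of ballotARB()

layout(location = 0) out vec4 color;

// Hypothetical name for the index the question reads from.
const uint SOURCE_INVOCATION = 20u;

void main() {
    if (gl_SubGroupInvocationARB != SOURCE_INVOCATION) {
        // Bit i of the mask is set if invocation i is active at this point.
        uint64_t activeMask = ballotARB(true);
        bool sourceActive =
            (activeMask & (uint64_t(1) << SOURCE_INVOCATION)) != uint64_t(0);

        // Green: invocation 20 is active here, so it could be read from.
        // Blue: invocation 20 is inactive here, so a read from it would
        // fall into the undefined case described above.
        color = sourceActive ? vec4(0, 1, 0, 1) : vec4(0, 0, 1, 1);
    } else {
        color = vec4(1, 0, 0, 1);
    }
}
```

Since invocation 20 itself always takes the else branch, one would expect mostly blue output from this shader, though what the mask actually contains there may differ between implementations.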

Since branch divergence is implementation-dependent, and you do nothing to verify that 20 is a valid active invocation index, whether your code has well-defined behavior depends on implementation details. It can therefore vary from implementation to implementation.
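One possible way to avoid depending on divergence is to do the read before the branch, so that invocation 20 executes readInvocationARB() together with the others, and to guard the read with the ballot mask. The following is a sketch under those assumptions rather than a verified fix; SOURCE_INVOCATION, activeMask, sourceActive and the blue fallback colour are illustrative choices, and GL_ARB_gpu_shader_int64 is assumed for uint64_t.

```glsl
#version 460
#extension GL_ARB_shader_ballot : enable
#extension GL_ARB_gpu_shader_int64 : enable  // provides uint64_t, the return type of ballotARB()

uniform uint data;
layout(location = 0) out vec4 color;

// Hypothetical name for the index the question reads from.
const uint SOURCE_INVOCATION = 20u;

void main() {
    uint var = data;

    // Taken before any branching on the invocation index, so invocation 20
    // contributes its bit if it exists and is active in this subgroup.
    uint64_t activeMask = ballotARB(true);
    bool sourceActive =
        (activeMask & (uint64_t(1) << SOURCE_INVOCATION)) != uint64_t(0);

    uint result = 0u;
    if (sourceActive) {
        // The mask is identical for every invocation that took part in the
        // ballot, so all of them, including invocation 20, enter this branch.
        result = readInvocationARB(var, SOURCE_INVOCATION);
    }

    if (gl_SubGroupInvocationARB == SOURCE_INVOCATION) {
        color = vec4(1, 0, 0, 1);     // red marker, as in the question
    } else if (!sourceActive) {
        color = vec4(0, 0, 1, 1);     // blue: nothing valid to read from
    } else {
        color = vec4(result == data); // white if the read value matches
    }
}
```

Subgroups in a fragment shader may be only partially filled, which is why the sketch keeps a fallback path for the case where invocation 20 is not active at all.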
