
OpenGL ES 2.0 point sprites with distinct rotations - calculate matrix in shader?

I am trying to find a solution that will allow me to rotate point sprites about the z-axis with a varying attribute (i.e. a uniform will not do).

In my app, many hundreds or thousands of point sprites are drawn per frame and then stored in VBOs (the total can quite feasibly end up being >1,000,000). As such, I am looking for the best compromise between memory usage and performance.

The vertex and fragment shaders currently look like this:

// VERTEX SHADER
attribute vec4 a_position;
attribute vec4 a_color;
attribute float a_size;
uniform mat4 u_mvpMatrix;
varying vec4 v_color;

void main()
{
    v_color = a_color;
    gl_Position = u_mvpMatrix * a_position;
    gl_PointSize = a_size;
}


// FRAGMENT SHADER
precision mediump float;
uniform sampler2D s_texture;
varying vec4 v_color;

void main()
{
    vec4 textureColor = texture2D(s_texture, gl_PointCoord);
    gl_FragColor = v_color * textureColor;
}

I can currently imagine the following possibilities:

  • Add a mat4 rotMatrix attribute to my point sprite data. Pass this to the fragment shader and rotate each fragment:

     vec2 texCoord = (rotMatrix * vec4(gl_PointCoord, 0, 1)).xy;
     gl_FragColor = v_color * texture2D(s_texture, texCoord);

    • Advantages:
      • Keeps shaders simple.
      • Simple code to compute matrices outside the shaders (using GLKit, for example).
    • Disadvantages:
      • Massively increases the size of my point sprite data (from 16 to 80 bytes/point for a 4x4 matrix, or to 52 bytes/point for a 3x3 matrix; I believe it's possible to use a 3x3 rotation matrix?). This could potentially cause my app to crash 3-5 times sooner!
      • Pushes a lot more computation onto the CPU (hundreds/thousands of matrix calculations per frame).


  • Add a float angle attribute to my point sprite data, then calculate the rotation matrix in the vertex shader. Pass the rotation matrix to the fragment shader as above.

    • Advantages:
      • Keeps point sprite data size small (from 16 to 20 bytes/point).
      • Pushes the heavy-lifting matrix maths to the GPU.
    • Disadvantages:
      • Need to write a custom GLSL function to create the rotation matrix. Not a massive problem, but my matrix maths is rusty, so this could be error-prone, especially if I'm trying to figure out the 3x3 matrix solution...
      • Given that this must happen on hundreds/thousands of vertices, is this going to be a serious drag on performance (despite being handled by the GPU)?


  • I could realistically cope with 1 byte for the angle attribute (255 different angles would be sufficient). Is there any way I could use some kind of lookup so that I don't need to needlessly recalculate the same rotation matrices? Storing constants in the vertex shader was my first thought, but I don't want to start putting branch statements in my shaders.

Any thoughts as to a good approach?

The solution I went with in the end was the second option from the question: calculate the rotation matrix in the vertex shader. This has the following advantages:

  • Keeps point sprite data size small.
  • Rotation calculations are performed by the GPU.

The disadvantages I guessed at don't seem to apply. I have not noticed a performance hit, even running on a 1st gen iPad. The matrix calculation in GLSL is somewhat cumbersome, but works fine. For the benefit of anybody else trying to do the same, here is the relevant part of the vertex shader:

//...
attribute float a_angle;
varying mat4 v_rotationMatrix;

void main()
{
    //...

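    // Note: these locals shadow the built-in cos()/sin() for the rest of main().
    float cos = cos(a_angle);
    float sin = sin(a_angle);
    // GLSL matrix constructors take values column by column, so the last four
    // values are the translation column: translate by (0.5, 0.5), the sprite centre.
    mat4 transInMat = mat4(1.0, 0.0, 0.0, 0.0,
                           0.0, 1.0, 0.0, 0.0,
                           0.0, 0.0, 1.0, 0.0,
                           0.5, 0.5, 0.0, 1.0);
    // Rotation about the z-axis.
    mat4 rotMat = mat4(cos, -sin, 0.0, 0.0,
                       sin, cos, 0.0, 0.0,
                       0.0, 0.0, 1.0, 0.0,
                       0.0, 0.0, 0.0, 1.0);
    mat4 resultMat = transInMat * rotMat;
    // Post-multiply by a translation of (-0.5, -0.5); only the last column changes.
    resultMat[3][0] = resultMat[3][0] + resultMat[0][0] * -0.5 + resultMat[1][0] * -0.5;
    resultMat[3][1] = resultMat[3][1] + resultMat[0][1] * -0.5 + resultMat[1][1] * -0.5;
    resultMat[3][2] = resultMat[3][2] + resultMat[0][2] * -0.5 + resultMat[1][2] * -0.5;
    v_rotationMatrix = resultMat;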

    //...
}
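
For completeness, here is a minimal sketch of a fragment shader that could consume that varying. It assumes the sampler and colour names from the question (s_texture, v_color) and simply applies the rotation snippet from the question's first option to v_rotationMatrix; it is an illustration, not necessarily the exact shader used in the app.

```glsl
// FRAGMENT SHADER (sketch; pairs with the vertex shader excerpt above)
precision mediump float;
uniform sampler2D s_texture;
varying vec4 v_color;
varying mat4 v_rotationMatrix;

void main()
{
    // gl_PointCoord runs from 0.0 to 1.0 across the point, so the matrix
    // rotates the lookup coordinate about the sprite centre (0.5, 0.5).
    vec2 texCoord = (v_rotationMatrix * vec4(gl_PointCoord, 0.0, 1.0)).xy;
    gl_FragColor = v_color * texture2D(s_texture, texCoord);
}
```

Near the corners of the point the rotated coordinate can fall outside the 0.0 to 1.0 range, so the sprite image probably wants a transparent border together with GL_CLAMP_TO_EDGE wrapping.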

Given that there is no noticeable performance hit, this solution is ideal: there is no need to create texture maps/lookups and consume additional memory, and it keeps the rest of the code clean and simple.

I can't say that there are no downsides to calculating a matrix for every vertex (reduced battery life, for example), and performance may be a problem in different scenarios, but it's good for what I need.

Have you thought about using different pre-calculated and rotated textures (a texture atlas)? If only a few angles are sufficient for the effect that you're trying to accomplish, this would be a very fast solution.
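
A rough sketch of how the atlas idea might look with the shaders from the question. The names a_frame, u_atlasGrid and v_atlasOrigin are hypothetical, and the sketch assumes the pre-rotated images are laid out in a regular grid of equally sized cells, indexed row by row. The two shaders are separate compilation units, shown together as in the question.

```glsl
// VERTEX SHADER (sketch)
attribute vec4 a_position;
attribute vec4 a_color;
attribute float a_size;
attribute float a_frame;            // hypothetical: index of the pre-rotated image
uniform mat4 u_mvpMatrix;
// Precision is spelled out because a uniform shared by both shaders
// must have the same precision in each of them.
uniform mediump vec2 u_atlasGrid;   // hypothetical: (columns, rows) of the atlas
varying vec4 v_color;
varying vec2 v_atlasOrigin;         // corner of the selected cell, in texture coordinates

void main()
{
    v_color = a_color;
    gl_Position = u_mvpMatrix * a_position;
    gl_PointSize = a_size;

    // Convert the linear frame index into a (column, row) cell.
    float row = floor(a_frame / u_atlasGrid.x);
    float col = a_frame - row * u_atlasGrid.x;
    v_atlasOrigin = vec2(col, row) / u_atlasGrid;
}


// FRAGMENT SHADER (sketch)
precision mediump float;
uniform sampler2D s_texture;
uniform mediump vec2 u_atlasGrid;
varying vec4 v_color;
varying vec2 v_atlasOrigin;

void main()
{
    // Shrink gl_PointCoord to the size of one cell and offset it into the chosen cell.
    vec2 texCoord = v_atlasOrigin + gl_PointCoord / u_atlasGrid;
    gl_FragColor = v_color * texture2D(s_texture, texCoord);
}
```

Since gl_PointCoord only exists in the fragment shader, this variant still computes its texture coordinate per fragment; what it saves is the matrix, at the cost of a larger texture and a limited set of angles.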

On a different note, there is a performance penalty for the calculation of texture coordinates within the fragment shader (indirect texture lookups). This might not be important for your case but it's worth keeping in mind.

Here is your pre-multiplied rotation matrix:

v_rotationMatrix = mat3(cos, sin, 0.0,
                        -sin, cos, 0.0,
                        (sin-cos+1.0)*0.5, (-sin-cos+1.0)*0.5, 1.0);
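
In case it helps to check the numbers, this is my reading of where those entries come from: translate the sprite centre to the origin, rotate by the angle θ, and translate back. The symbols c, s and θ are just shorthand introduced here.

```latex
M = T\!\left(\tfrac12,\tfrac12\right) R(\theta)\, T\!\left(-\tfrac12,-\tfrac12\right)
  = \begin{pmatrix}
      c & -s & \tfrac12\,(1 - c + s) \\
      s &  c & \tfrac12\,(1 - s - c) \\
      0 &  0 & 1
    \end{pmatrix},
\qquad c = \cos\theta,\quad s = \sin\theta
```

Reading M column by column gives (c, s, 0), (-s, c, 0) and ((s - c + 1)/2, (-s - c + 1)/2, 1), which is the order of the values passed to the mat3 constructor above.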

FWIW, this is the 3x3 pre-computed matrix I got that matches Stuart's code:

v_rotationMatrix = mat3(cos, -sin, 0.0, sin, cos, 0.0, (1.0-cos-sin)*0.5, (1.0+sin-cos)*0.5, 1.0);

Note that GLSL matrices are in column-major format.
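
As a sketch of how the mat3 version might slot into the shaders from the question: the attribute and varying names are taken from the earlier posts, while the locals c and s are my own shorthand for the cosine and sine. A mat3 varying also uses three varying vectors rather than four. The two shaders are separate compilation units, shown together as in the question.

```glsl
// VERTEX SHADER (sketch)
attribute vec4 a_position;
attribute vec4 a_color;
attribute float a_size;
attribute float a_angle;
uniform mat4 u_mvpMatrix;
varying vec4 v_color;
varying mat3 v_rotationMatrix;

void main()
{
    v_color = a_color;
    gl_Position = u_mvpMatrix * a_position;
    gl_PointSize = a_size;

    float c = cos(a_angle);
    float s = sin(a_angle);
    // Same values as the one-line matrix above, listed one column per line.
    v_rotationMatrix = mat3(c, -s, 0.0,
                            s, c, 0.0,
                            (1.0 - c - s) * 0.5, (1.0 + s - c) * 0.5, 1.0);
}


// FRAGMENT SHADER (sketch)
precision mediump float;
uniform sampler2D s_texture;
varying vec4 v_color;
varying mat3 v_rotationMatrix;

void main()
{
    // Homogeneous 2D point: the third component of 1.0 picks up the translation column.
    vec2 texCoord = (v_rotationMatrix * vec3(gl_PointCoord, 1.0)).xy;
    gl_FragColor = v_color * texture2D(s_texture, texCoord);
}
```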
