
Texture coordinates and optimizing GLSL shaders

I'm debating the pros and cons of passing texture coordinates to a GLSL shader in various ways.

I'm rendering a lot of instance data. I have one basic model, and then I pass a Transformation Matrix and a Texture/Sprite Index to my shader. Each model is then rotated and translated as per the transformation matrix, and the texture is decided as per this snippet:

TexCoord0 = vec2(TexCoord.x+(TexIndex%16),TexCoord.y+(TexIndex/16))/16;

The thing I don't like about this is that I've hard-coded the sprite and texture size. I could pass this information in as uniforms, but I'd still be limited in that the sprite size can't vary from instance to instance (not that I have a planned use case for this). It's also a bit more computation on the GPU to determine the sprite's coordinates.
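To make the arithmetic easy to check outside a shader, here is a minimal CPU-side sketch of the same mapping in Python rather than GLSL. The grid size is a parameter instead of the literal 16, which is roughly what a uniform would provide. The names `atlas_tex_coord` and `sprites_per_row` are made up for this sketch, and it assumes a square grid of equally sized sprites.

```python
def atlas_tex_coord(tex_coord, tex_index, sprites_per_row=16):
    """Map a per-sprite (u, v) in [0, 1] to coordinates in the whole atlas.

    Mirrors the shader line above, with the grid size passed in
    (hypothetical parameter) instead of hard-coded.
    """
    u, v = tex_coord
    col = tex_index % sprites_per_row    # TexIndex % 16 in the shader
    row = tex_index // sprites_per_row   # integer TexIndex / 16 in the shader
    return ((u + col) / sprites_per_row, (v + row) / sprites_per_row)


# Sprite 17 in a 16x16 atlas sits at column 1, row 1.
print(atlas_tex_coord((0.0, 0.0), 17))  # corner of the cell at u = v = 0
print(atlas_tex_coord((1.0, 1.0), 17))  # opposite corner of the same cell
```

Every vertex of every instance repeats the modulus, the integer division and the final divide, which is the per-vertex cost being weighed here.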

Another method I could use would be to specify an entire Rect which would delimit the position, width and height of the sprite within the texture map. However, this would require specifying 4 floats (16 bytes) of information, rather than a single texture index byte. Multiply that by, say, 200K instances and we're looking at about 3 MB of data (in addition to the other data). I don't know if that is considered "a lot" in today's day and age or not.
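A rough Python sketch of the rect alternative and the buffer-size arithmetic, assuming four 32-bit floats per rect and one unsigned byte per index. The function names are illustrative. The rect is computed once per instance on the CPU, so the per-vertex work shrinks to a multiply-add.

```python
def sprite_rect(tex_index, sprites_per_row=16):
    """Return (x, y, width, height) of a grid cell in normalized atlas coordinates.

    Computed once per instance on the CPU and uploaded as an attribute.
    """
    size = 1.0 / sprites_per_row
    col = tex_index % sprites_per_row
    row = tex_index // sprites_per_row
    return (col * size, row * size, size, size)


def rect_tex_coord(tex_coord, rect):
    """Per-vertex work with a rect attribute: one multiply-add per component."""
    u, v = tex_coord
    x, y, w, h = rect
    return (x + u * w, y + v * h)


INSTANCES = 200_000
BYTES_PER_RECT = 4 * 4    # four 32-bit floats (assumed layout)
BYTES_PER_INDEX = 1       # one unsigned byte

rect_bytes = INSTANCES * BYTES_PER_RECT      # total for the rect attribute
index_bytes = INSTANCES * BYTES_PER_INDEX    # total for the index attribute
extra_bytes = rect_bytes - index_bytes       # additional data versus the index

print(rect_bytes, index_bytes, extra_bytes)
```

Because each rect carries its own width and height, sprites no longer need to share one size or sit on a regular grid.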

Should I be focusing on easing the computation in my GLSL shaders or on minimizing the size of my buffers? I hear that transferring data to the GPU is often the bottleneck, but I'll be recopying the data to the buffer very seldom compared to the number of vertices that have to be rendered every frame.


Likewise, I'm considering taking out my model transform matrix and replacing it with a vec3 for translation and a vec2 for rotation (I only need 2 degrees of rotation), which would knock me down from 16 floats to 5; I can then rebuild the matrix in the vertex shader. Again, this takes away some flexibility, and I'm not sure of the cost savings.
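A sketch of what rebuilding the matrix from 5 floats could look like, written in Python so it can be run outside a shader. The post does not say which two rotations are needed, so this assumes yaw about the Y axis followed by pitch about the X axis, with translation applied last. The function names and the axis convention are assumptions.

```python
import math


def model_matrix(translation, rotation):
    """Build a 4x4 model matrix (list of rows) from 3 + 2 floats.

    Assumed convention: M = T * Ry(yaw) * Rx(pitch), angles in radians.
    """
    tx, ty, tz = translation
    yaw, pitch = rotation
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    # Rows of Ry(yaw) * Rx(pitch), with the translation in the last column.
    return [
        [cy,  sy * sp, sy * cp, tx],
        [0.0, cp,      -sp,     ty],
        [-sy, cy * sp, cy * cp, tz],
        [0.0, 0.0,     0.0,     1.0],
    ]


def transform_point(m, p):
    """Apply the matrix to a point (x, y, z), treating it as (x, y, z, 1)."""
    x, y, z = p
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in m[:3])


# Per-instance attribute size, assuming 32-bit floats.
MATRIX_BYTES = 16 * 4    # full mat4
COMPACT_BYTES = 5 * 4    # vec3 translation + vec2 rotation

print(transform_point(model_matrix((1.0, 2.0, 3.0), (0.0, 0.0)), (0.0, 0.0, 0.0)))
print(MATRIX_BYTES, COMPACT_BYTES)
```

In a vertex shader the same matrix costs two sines and two cosines per vertex, and a GLSL mat4 constructor takes its arguments column by column, so the rows above would need to be transposed when written out.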

I tried doing it the other way, specifying a texture rect rather than a byte index, and it actually yielded a huge speed increase (520 FPS to 3600 FPS, or 1.92 ms/frame to 0.28 ms/frame).

It seems that reducing computation is more important, at least on my GPU (Radeon HD 5700 series). Or perhaps it's just modulus that's expensive, not sure. I'm quite pleased with the results though; I get more flexibility at a cheaper cost!
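One way to probe the modulus hypothesis, assuming the grid stays a power of two, would be to take the column and row with a mask and a shift instead of `%` and `/`. The Python sketch below only checks that the two forms agree for non-negative indices. It says nothing about GPU timing, and the function names are illustrative.

```python
def cell_divmod(tex_index, sprites_per_row=16):
    """Column and row using modulus and integer division, as in the original snippet."""
    return tex_index % sprites_per_row, tex_index // sprites_per_row


def cell_bitwise(tex_index, shift=4):
    """Column and row using a mask and a shift; valid when the row length is 2**shift."""
    mask = (1 << shift) - 1
    return tex_index & mask, tex_index >> shift


# Both forms agree for every index a single byte can hold.
agree = all(cell_divmod(i) == cell_bitwise(i) for i in range(256))
print(agree)
```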
