OpenGL ES, Z-Buffer, 2D sprites, discard, performance

I have a retro-looking 2D game with a lot of sprites (reminiscent of Sega's Super Scaler arcade games), none of which use semi-transparency. I have thought about using the Z-buffer instead of sorting the sprites myself, to simplify things. However, by default depth is written to the Z-buffer even where alpha is zero, which gives the effect illustrated here:

http://i.stack.imgur.com/ubLlp.png

Now, since I'm on OpenGL ES 2, I don't have alpha testing, so from what I understand my only option is to discard the fragment in the fragment shader if alpha is 0, so that it doesn't get written to the Z-buffer. But in terms of performance this is SO wrong: not only is the if slow, but the discard basically defeats the purpose, since it disables early depth testing, and the result is way worse than doing the sorting in software.

if (val.a < 0.5) {
    discard;
}
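
To make the artifact and the effect of discard concrete, here is a minimal CPU-side sketch of a single scanline with a depth buffer. It is only a model, not how the GPU actually works: the names Texel, Scanline and drawRow are hypothetical, alpha is treated as binary, and the depth test is assumed to behave like GL_LESS.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One texel of a sprite row: a palette index for colour plus an alpha value.
struct Texel {
    std::uint8_t color;
    float alpha;
};

// A single scanline of a framebuffer with a matching depth buffer.
struct Scanline {
    std::vector<std::uint8_t> color;
    std::vector<float> depth;

    explicit Scanline(std::size_t width)
        : color(width, 0), depth(width, 1.0f) {}  // 0 = background, 1.0 = far plane
};

// Draws one sprite row at horizontal offset x0 and depth z (smaller = nearer).
// When useDiscard is true, texels with alpha < 0.5 are skipped entirely,
// mirroring the discard in the fragment shader above.
void drawRow(Scanline& line, const std::vector<Texel>& row,
             std::size_t x0, float z, bool useDiscard) {
    for (std::size_t i = 0; i < row.size(); ++i) {
        const std::size_t x = x0 + i;
        if (x >= line.color.size()) break;
        if (useDiscard && row[i].alpha < 0.5f) continue;  // no colour, no depth write
        if (z < line.depth[x]) {
            // Without discard, depth is written even for a transparent texel.
            line.depth[x] = z;
            // A transparent texel leaves the colour untouched, as blending would.
            if (row[i].alpha >= 0.5f) line.color[x] = row[i].color;
        }
    }
}
```

Drawing a near sprite with a transparent texel and then a far sprite behind it leaves a background-coloured hole when useDiscard is false, because the far sprite fails the depth test there. With useDiscard set to true the far sprite shows through.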

Is there any other solution I could use that would not kill performance? Do all 2D games sort sprites themselves and not use the depth buffer?

It's a tradeoff, really. If you let the Z-buffer do the sorting and use discard in your shaders, it's more expensive on the GPU because of the branching and the late depth testing, as you say.

If you do the depth sorting yourself, you'll find it's harder to issue your draw calls in an optimal order (e.g. you'll keep having to change texture). Draw calls on GLES2 have a very significant CPU cost on lower-end devices, and the draw call count will probably go up.
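
As a rough illustration of why the draw call count goes up, here is a minimal sketch that sorts sprites back to front and then counts how many draw calls you would need if only consecutive sprites sharing a texture can be merged. The Sprite struct, the "larger depth = farther" convention and both function names are assumptions for the example, not part of any GLES API.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Sprite {
    float depth;    // assumed convention: larger = farther from the camera
    int textureId;  // hypothetical texture handle
};

// Painter's algorithm ordering: farthest sprite first. stable_sort keeps the
// original submission order for sprites at equal depth.
void sortBackToFront(std::vector<Sprite>& sprites) {
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite& a, const Sprite& b) {
                         return a.depth > b.depth;
                     });
}

// Counts the draw calls needed when a new call is issued every time the
// texture changes between consecutive sprites.
std::size_t countBatches(const std::vector<Sprite>& sprites) {
    std::size_t batches = 0;
    for (std::size_t i = 0; i < sprites.size(); ++i) {
        if (i == 0 || sprites[i].textureId != sprites[i - 1].textureId) {
            ++batches;
        }
    }
    return batches;
}
```

With three sprites where the middle one in depth order uses a different texture, the sorted list needs three draw calls, whereas sorting by texture would have needed two. If all three lived on the same texture, one call would do.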

If performance is a big concern, then the second option is probably better, provided you combine it with a big effort on the texture atlasing front to minimize your draw call count. This might be particularly effective if your sprites are low-resolution retro sprites, because you'll be able to fit a lot of sprites into each texture atlas. It isn't a clear winner by any stretch, and I can imagine that different games take different approaches.
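
For the atlasing side, a minimal sketch of looking up a sprite's texture coordinates might look like this. It assumes the simplest possible layout, a square atlas cut into a uniform grid of square cells with no padding; UvRect and atlasUv are hypothetical names, and a real atlas packer would usually store a rectangle per sprite instead.

```cpp
// Normalized texture coordinates of one sprite inside the atlas.
struct UvRect {
    float u0, v0, u1, v1;
};

// Returns the UV rectangle of the sprite at the given index, counting cells
// row by row from the top-left corner. Assumes atlasSize is a multiple of
// cellSize and that index is within the grid.
UvRect atlasUv(int index, int atlasSize, int cellSize) {
    const int columns = atlasSize / cellSize;
    const int col = index % columns;
    const int row = index / columns;
    const float scale =
        static_cast<float>(cellSize) / static_cast<float>(atlasSize);
    return {col * scale, row * scale, (col + 1) * scale, (row + 1) * scale};
}
```

Because every sprite then samples from the same texture, the sorted sprite list can go into one vertex buffer and be drawn with a single call, whatever the depth order is.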

Also, you should take into account that the vast majority of target hardware is going to perform just fine whichever path you choose. Maybe you should just choose the one that is faster to implement and makes your code simpler, which is probably letting the Z-buffer do the sorting.

If you fancy a technical challenge, I've often thought the best approach might be to divide your sprites into fully opaque sections and sections with transparency, and render the two parts as separate meshes (they won't be quads any more). You'd have to do a lot of preprocessing and draw a lot more triangles, but by rendering the fully opaque parts separately you can take advantage of the hidden-surface-removal tech in all iOS devices and lots of Android devices. By doing this you should certainly be able to reduce your fill rate burden, but at the cost of more draw calls, and it might add an unnecessarily high amount of complexity to your code and your tools.
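
A first preprocessing step for that idea could be sketched as follows: cut each sprite's alpha channel into square tiles and classify every tile as fully transparent, fully opaque or mixed. This is only one possible way to do the split, under the assumption that alpha is stored as 8-bit values in row-major order; TileKind and classifyTiles are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class TileKind { Transparent, Opaque, Mixed };

// Classifies tile-by-tile, left to right and top to bottom. alpha holds
// width * height values where 0 is fully transparent and 255 fully opaque.
// Assumes tile > 0; tiles at the right and bottom edges may be smaller.
std::vector<TileKind> classifyTiles(const std::vector<std::uint8_t>& alpha,
                                    std::size_t width, std::size_t height,
                                    std::size_t tile) {
    std::vector<TileKind> kinds;
    for (std::size_t ty = 0; ty < height; ty += tile) {
        for (std::size_t tx = 0; tx < width; tx += tile) {
            bool allOpaque = true;
            bool allTransparent = true;
            const std::size_t yEnd = std::min(ty + tile, height);
            const std::size_t xEnd = std::min(tx + tile, width);
            for (std::size_t y = ty; y < yEnd; ++y) {
                for (std::size_t x = tx; x < xEnd; ++x) {
                    const std::uint8_t a = alpha[y * width + x];
                    if (a != 255) allOpaque = false;
                    if (a != 0) allTransparent = false;
                }
            }
            kinds.push_back(allOpaque        ? TileKind::Opaque
                            : allTransparent ? TileKind::Transparent
                                             : TileKind::Mixed);
        }
    }
    return kinds;
}
```

Opaque tiles could then go into a mesh drawn with a shader that never discards, mixed tiles into a second mesh that still needs the discard path, and transparent tiles can be dropped entirely.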
