
glDrawElements massive CPU usage on iOS

Hardware: iPad 2. Software: OpenGL ES 2.0, C++.

glDrawElements seems to take up about 25% of the CPU, which makes each frame cost 18 ms on the CPU and 10 ms on the GPU.

When I don't use an index buffer and use glDrawArrays instead, it speeds up and glDrawArrays barely shows up in the profiler. Everything else is the same, except that glDrawArrays has more verts because I have to duplicate verts in the VBO without the index buffer.
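To put rough numbers on the duplication trade-off, here is a minimal sketch that compares buffer sizes for the two approaches. The regular grid of quads is a hypothetical stand-in for the real models, and the `Footprint` type and both function names are made up for illustration; only the 8-byte vertex and 16-bit indices come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two floats per vertex, 8 bytes, as described in the question.
struct Vertex { float x, y; };
static_assert(sizeof(Vertex) == 8, "expected a tightly packed 8-byte vertex");

// Hypothetical helper type: bytes needed for each buffer.
struct Footprint {
    std::size_t vertexBytes;
    std::size_t indexBytes;
    std::size_t total() const { return vertexBytes + indexBytes; }
};

// Unindexed GL_TRIANGLES: every quad stores its own six vertices.
Footprint unindexedFootprint(std::size_t quadsX, std::size_t quadsY) {
    std::size_t vertexCount = quadsX * quadsY * 6;
    return { vertexCount * sizeof(Vertex), 0 };
}

// Indexed GL_TRIANGLES: shared corners are stored once and referenced
// through 16-bit indices.
Footprint indexedFootprint(std::size_t quadsX, std::size_t quadsY) {
    std::size_t vertexCount = (quadsX + 1) * (quadsY + 1);
    assert(vertexCount <= 65536);  // must be addressable with 16-bit indices
    std::size_t indexCount = quadsX * quadsY * 6;
    return { vertexCount * sizeof(Vertex), indexCount * sizeof(std::uint16_t) };
}
```

For a 32 by 32 grid, `unindexedFootprint(32, 32).total()` is 49152 bytes and `indexedFootprint(32, 32).total()` is 21000 bytes. With a vertex this small the index buffer is a large share of the indexed total, so the saving is less dramatic than it would be with fat vertices.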

What I have checked so far:

  • virtually the same number of state changes between the two methods
  • vertex structure is two floats (8 bytes)
  • index buffer is 16-bit (tried 32-bit as well)
  • GL_STATIC_DRAW for both buffers
  • buffers don't change after load
  • the same VBO and index buffer are rendered multiple times per frame, with different offsets and sizes
  • no OpenGL errors
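For reference, this is roughly how a sub-range of a shared index buffer is addressed. It is only a sketch: `DrawRange`, `rangeFits`, and `indexByteOffset` are hypothetical names, 16-bit indices are assumed, and nothing here is claimed to be the cause of the slowdown. The one GL fact it relies on is that, with a buffer bound to GL_ELEMENT_ARRAY_BUFFER, the last argument of glDrawElements is a byte offset into that buffer rather than an element index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of one sub-range of the shared index buffer.
struct DrawRange {
    std::size_t firstIndex;  // position of the first index, in elements
    std::size_t indexCount;  // number of indices to draw
};

// True if the range lies entirely inside a buffer of totalIndices elements.
inline bool rangeFits(const DrawRange& r, std::size_t totalIndices) {
    return r.firstIndex <= totalIndices &&
           r.indexCount <= totalIndices - r.firstIndex;
}

// Byte offset to pass as the last argument of glDrawElements,
// assuming GL_UNSIGNED_SHORT indices.
inline std::size_t indexByteOffset(const DrawRange& r, std::size_t totalIndices) {
    assert(rangeFits(r, totalIndices));
    return r.firstIndex * sizeof(std::uint16_t);
}

// Intended use (needs a GL context, so it is left as a comment):
//   glDrawElements(GL_TRIANGLES, (GLsizei)r.indexCount, GL_UNSIGNED_SHORT,
//                  reinterpret_cast<const void*>(indexByteOffset(r, total)));
```

A range starting at index 300 therefore gets a byte offset of 600 with 16-bit indices.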

So it looks like it's doing a software fallback of some sort, but I can't figure out what would cause OpenGL to fall back.

There are a few things that immediately jump to mind that might affect speed the way you describe.

For one, many commands are issued lazily to reduce the number of bus transfers. They are queued up and wait for the next batch transfer. State changes, texture changes, and similar commands all accumulate. It is possible that the draw commands are triggering a larger transfer in one case but not in the other, or that you are triggering more frequent transfers in one case than in the other.

For another, your specific models might be better organized for one draw call or the other. You need to look at how big they are, whether they reuse index values, and whether they are optimized or reordered for rendering. glDrawArrays may require more data to be transferred, but if your models are small the overhead may not be much of a concern.

Draw frequency also matters. You want to issue calls often enough to keep the GPU busy and let your CPU do other work, rather than letting commands accumulate in the command buffer waiting to be sent, but this needs to be balanced because each transfer has a cost.

And to top it off, indexed vertices benefit from cache effects when they are frequently reused, while linearly accessed arrays benefit from cache effects precisely because they are accessed linearly. You need to know your data, since different types of data benefit from different methods.
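One way to check whether an index list reuses vertices in a cache-friendly order is to replay it against a small simulated cache. The sketch below assumes a simple FIFO cache; both the FIFO policy and the cache size you pass in are assumptions for illustration, not a description of what the iPad 2 GPU actually does, and `countCacheMisses` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Replays an index list against a FIFO cache holding at most cacheSize
// entries and returns how many lookups missed.
std::size_t countCacheMisses(const std::vector<std::uint16_t>& indices,
                             std::size_t cacheSize) {
    assert(cacheSize > 0);
    std::deque<std::uint16_t> cache;
    std::size_t misses = 0;
    for (std::uint16_t idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) != cache.end()) {
            continue;  // hit: the vertex is still in the cache
        }
        ++misses;
        cache.push_back(idx);
        if (cache.size() > cacheSize) {
            cache.pop_front();  // evict the oldest entry
        }
    }
    return misses;
}
```

Dividing the miss count by the number of triangles gives a rough figure for comparing two orderings of the same mesh. A value near 3 means the indices are buying almost no reuse, which is the situation where plain glDrawArrays has little to lose.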

Even Apple seems to be unsure which method to use.

Up to and including iOS 7, the OpenGL ES Programming Guide for iOS said:

For best performance, your models should be submitted as a single unindexed triangle strip using glDrawArrays with as few duplicated vertices as possible. If your models require many vertices to be duplicated (...), you may obtain better performance using a separate index buffer and calling glDrawElements instead. ... For best results, test your models using both indexed and unindexed triangle strips, and use the one that performs the fastest.

But their updated OpenGL ES Programming Guide for iOS, which applies to iOS 8, recommends the opposite:

For best performance, your models should be submitted as a single indexed triangle strip. To avoid specifying data for the same vertex multiple times in the vertex buffer, use a separate index buffer and draw the triangle strip using the glDrawElements function.
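As an illustration of what a single indexed triangle strip can look like, here is a sketch that builds the index list for a regular grid and joins the rows with repeated indices (degenerate triangles), which is a common way to merge strips. The grid layout, the row-major vertex numbering, and the name `gridStripIndices` are assumptions for the example and are not taken from Apple's guide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds one GL_TRIANGLE_STRIP index list for a grid of cols x rows vertices,
// where vertex (row, col) has index row * cols + col.
std::vector<std::uint16_t> gridStripIndices(std::size_t cols, std::size_t rows) {
    assert(cols >= 2 && rows >= 2);
    assert(cols * rows <= 65536);  // must be addressable with 16-bit indices
    std::vector<std::uint16_t> out;
    out.reserve((rows - 1) * 2 * cols + (rows - 2) * 2);
    for (std::size_t r = 0; r + 1 < rows; ++r) {
        if (r > 0) {
            // Degenerate join: repeat the first index of this row's strip.
            out.push_back(static_cast<std::uint16_t>(r * cols));
        }
        for (std::size_t c = 0; c < cols; ++c) {
            out.push_back(static_cast<std::uint16_t>(r * cols + c));
            out.push_back(static_cast<std::uint16_t>((r + 1) * cols + c));
        }
        if (r + 2 < rows) {
            // Degenerate join: repeat the last index of this row's strip.
            out.push_back(static_cast<std::uint16_t>((r + 1) * cols + cols - 1));
        }
    }
    return out;
}
```

The resulting vector would be uploaded once to the element array buffer and drawn with a single glDrawElements call using GL_TRIANGLE_STRIP and GL_UNSIGNED_SHORT. Because each row contributes an even number of indices, including the two join indices, the winding order stays consistent from row to row.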

It looks like in your case you have simply tried both and found that one method is better suited to your data.
