
Android GPU profiling - OpenGL Live Wallpaper is slow

I'm developing a Live Wallpaper using OpenGL ES 3.0. I've set it up according to the excellent tutorial at http://www.learnopengles.com/how-to-use-opengl-es-2-in-an-android-live-wallpaper/ , adapting GLSurfaceView and using it inside the Live Wallpaper.

I have a decent knowledge of OpenGL/GLSL best practices, and I've set up a simple rendering pipeline where the draw loop is as tight as possible: no re-allocations, one static VBO for non-changing data, a dynamic VBO for updates, only one draw call, no branching in the shaders, et cetera. I usually get very good performance, but at seemingly random yet recurring times, the framerate drops.

Profiling with the on-screen bars shows intervals where the yellow bar ("waiting for commands to complete") shoots up and pushes the whole frame above the critical 60fps threshold.

[Screenshot]

I've read every resource on profiling and interpreting those numbers that I can get my hands on, including the nice in-depth SO question here. However, the main takeaway from that question seems to be that the yellow bar indicates time spent waiting for blocking operations to complete, and for frame dependencies. I don't believe I have any of those; I just draw everything at every frame. There is no reading back.

My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.

Here are some details that may or may not have an impact:

  • I'm rendering on demand; onOffsetsChanged is the trigger (render when dirty).
  • There is a single texture (created and bound only once), 1024x1024 RGBA. Replacing the one texture2D call with a plain vec4 seems to remove some of the framerate drops. Reducing the texture size to 512x512 does nothing for performance.
  • The shaders are not complex and, as stated before, contain no branching.
  • There is not much data in the scene: only ~300 vertices and the one texture.
  • A systrace shows no suspicious methods - the GL-related methods such as buffer population and state calls are not at the top of the list.

Update: As an experiment, I tried rendering only every other frame, i.e. not requesting a render on every onOffsetsChanged (swipe left/right). This was horrible for the look and feel, but got rid of the yellow lag spikes almost completely. This seems to tell me that doing 60 render requests per second is too much, but I can't figure out why.

My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.

(1) Accumulation of render state. Make sure you glClear the color/depth/stencil buffers before you start each render pass (although if you are rendering directly to the window surface this is unlikely to be the problem, as state is guaranteed to be cleared every frame unless you set EGL_BUFFER_PRESERVED).

(2) Buffer/texture ghosting. Rendering is deeply pipelined, but OpenGL ES tries to present a synchronous programming abstraction. If you try to write to a buffer or texture (glBufferSubData, glTexSubImage2D, glMapBufferRange, etc.) that is still pending use by a GPU operation queued in the pipeline, then you either have to block and wait, or you force a copy of that resource to be created. This copy can be really expensive for large resources.
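One common way to sidestep ghosting on the dynamic VBO is to rotate through a few pre-allocated buffers, so the one you write this frame is not the one the GPU may still be reading. Below is a minimal sketch of just the rotation logic. The class name BufferRing is made up for illustration, and the handles are assumed to be ids you created earlier with glGenBuffers and sized with glBufferData. Whether it helps depends on the driver, so treat it as an experiment rather than a guaranteed fix.

```java
// Hypothetical helper (not part of any Android or GL API): hands out
// pre-allocated buffer handles in round-robin order.
final class BufferRing {
    // Assumption: ids previously returned by glGenBuffers, each already
    // allocated with glBufferData at the size the dynamic data needs.
    private final int[] handles;
    private int cursor = 0;

    BufferRing(int[] handles) {
        if (handles == null || handles.length < 2) {
            throw new IllegalArgumentException("need at least two buffers to rotate");
        }
        this.handles = handles.clone();
    }

    /** Returns the handle to write into this frame, then advances the ring. */
    int next() {
        int handle = handles[cursor];
        cursor = (cursor + 1) % handles.length;
        return handle;
    }

    int size() {
        return handles.length;
    }
}

// Per-frame usage inside the renderer (GL calls shown as comments because
// they need a live GL context):
//   int vbo = ring.next();
//   GLES30.glBindBuffer(GLES30.GL_ARRAY_BUFFER, vbo);
//   GLES30.glBufferSubData(GLES30.GL_ARRAY_BUFFER, 0, byteCount, data);
//   // re-point the vertex attributes at the buffer bound above, then draw
```

Two or three buffers are usually enough to cover the frames in flight. The cost is the extra memory, which is negligible for ~300 vertices.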

(3) Device DVFS (dynamic voltage and frequency scaling) can be quite sensitive on some devices, especially for content which happens to sit right around a decision point between two frequency levels. If the GPU or CPU frequency drops then you may well get a spike in the amount of time a frame takes to process. For debug purposes some devices provide a means to fix the frequency via sysfs - although there is no standard mechanism.
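Even where you cannot pin the frequency, it may be possible to read the current one and log it next to your frame times. The sketch below reads a single sysfs node containing a frequency in kHz. The class name FreqSampler is made up, and the path shown is the usual Linux cpufreq node for CPU core 0. It may be missing or unreadable on a given device, and GPU nodes are vendor-specific, so none is shown here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical diagnostic helper: reads one frequency value from sysfs.
final class FreqSampler {
    // Assumption: standard Linux cpufreq node for core 0. Access can be
    // blocked by device permissions, in which case readKhz returns -1.
    static final String CPU0_CUR_FREQ =
            "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq";

    /** Returns the frequency in kHz, or -1 if the node cannot be read or parsed. */
    static long readKhz(Path node) {
        try {
            String text = new String(Files.readAllBytes(node), StandardCharsets.US_ASCII).trim();
            return Long.parseLong(text);
        } catch (IOException | NumberFormatException | SecurityException e) {
            return -1L;
        }
    }
}
```

Sampling once per second from a background thread with FreqSampler.readKhz(Paths.get(FreqSampler.CPU0_CUR_FREQ)) is enough to see whether a frequency drop lines up with the yellow spikes.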

(4) Thermal limitations - most modern mobile devices can produce more heat than they can dissipate if everything is running at high frequency, so the maximum performance point cannot be sustained. If your content is particularly heavy then you may find that thermal management kicks in after a "while" (1-10 minutes depending on device, in my experience) and forcefully drops the frequency until thermal levels fall back within safe margins. This shows up as somewhat random increases in frame processing time, and is normally unpredictable once a device hits the "warm" state.
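To tell thermal or DVFS effects apart from per-frame stalls, it can help to log frame times over a few minutes and watch whether the average creeps up or only isolated frames blow the budget. Here is a small sketch of such a log. The class name FrameTimeLog and the window size are illustrative choices, and the timestamps are assumed to come from System.nanoTime() taken around your draw call.

```java
// Hypothetical helper: keeps a rolling window of CPU-side frame durations
// and counts how many frames exceeded the 60 fps budget.
final class FrameTimeLog {
    static final double BUDGET_MS = 1000.0 / 60.0; // about 16.67 ms per frame

    private final double[] window; // most recent durations, in milliseconds
    private int total = 0;         // frames recorded so far
    private int overBudget = 0;    // frames longer than BUDGET_MS so far

    FrameTimeLog(int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        window = new double[windowSize];
    }

    /** Records one frame from timestamps taken with System.nanoTime(). */
    void record(long startNanos, long endNanos) {
        double ms = (endNanos - startNanos) / 1_000_000.0;
        window[total % window.length] = ms; // overwrite the oldest slot
        total++;
        if (ms > BUDGET_MS) {
            overBudget++;
        }
    }

    /** Mean duration over the frames currently held in the window. */
    double windowMeanMs() {
        int n = Math.min(total, window.length);
        if (n == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += window[i];
        }
        return sum / n;
    }

    int overBudgetFrames() { return overBudget; }

    int totalFrames() { return total; }
}
```

A rising windowMeanMs() after the device has been busy for a while points towards throttling, while a flat mean with occasional over-budget frames points back at stalls such as (2). Note that this only measures time on the render thread, including any time it spends blocked inside GL calls.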

If it is possible to share an API sequence which reproduces the issue it would be easier to provide more targeted advice - the question is really rather general and OpenGL ES is a very wide API ;)

