

Getting rid of performance setting knobs by finding the best settings that will run at 60fps and 30fps

Intro

I am developing a game whose quality depends largely on two factors.

  1. How fancy the post-processing effect is: the fancier the better, but also the more GPU-straining.
  2. How many entities can be simulated, which puts strain on the CPU.

The problem is that it is a perfectly good game even without post-processing, which allows even pre-2010 devices to run it. I have about 48 different levels of post-processing: most devices can run, say, level 10, while newer devices can run level 48, which looks significantly cooler.

My dilemma is that I do not want my users to have to slog through an interface where they select how many entities to simulate and tweak the two settings that affect post-processing quality. This is for two reasons.

  1. The main game already has plenty of interface elements, and I really do not want to add another. It would take away from the minimalist, magical look of the game.
  2. I had the thought of giving the user a 30fps and a 60fps option (the best settings that would run at that fps), and that has really become a goal for me.

Question

However, I am having issues implementing my dream of a 30fps option and a 60fps option, mostly for the following reasons, which I am hoping you can help me out with.

  1. Profiling takes way too long! Too many tests make the process drag on, and the shorter I make each test, the less accurate the results.
  2. Profiling is not accurate. Even though I cut out the data from the first and last 20% of each test, it still gives slightly different results each time.
  3. Since this is going to be an iOS and Android app, there are SO many devices to profile that it would be difficult to program in predictions.
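To make the trimming in point 2 concrete, here is a minimal sketch in Python (chosen only for illustration, the game itself may be in any language). The function name `trimmed_fps` is hypothetical. It drops the first and last 20% of the frames as described above and then, as an assumption of this sketch rather than something from the post, takes the median of what is left, since a median is less sensitive to a few stray spikes than a plain average.

```python
from statistics import median


def trimmed_fps(frame_times_ms, trim=0.2):
    """Estimate sustained fps from a list of per-frame durations in milliseconds.

    Drops the first and last `trim` fraction of the frames, then converts the
    median of the remaining frame times to frames per second.
    """
    n = len(frame_times_ms)
    cut = int(n * trim)                  # number of frames dropped at each end
    kept = frame_times_ms[cut:n - cut]
    if not kept:
        raise ValueError("not enough frames left after trimming")
    return 1000.0 / median(kept)


# Slow frames at the start and end are discarded before the estimate is made.
sample = [40, 35] + [16] * 6 + [30, 33]
print(trimmed_fps(sample))
```

Repeated runs will still differ slightly, so comparing the result against a threshold with some margin is likely more robust than expecting identical numbers.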

How would you test devices in order to figure out the best settings for 30fps and 60fps? No need to show code (articles would be nice), just explain the process.

If you want to read some more, here is my current method.

My Current Method

Here is how I do it currently. My test (although it takes 3 minutes and is slightly inaccurate) relies on some assumptions that make it faster.

  1. Each new entity does not add significant time to GPU rendering, meaning that the CPU and GPU tests can be done separately rather than testing combinations.
  2. Post-processing is handled by two settings:
    1. Down-sample size: Basically, the post-processing effect is rendered at a lower resolution and then fit onto a higher one. So if you down-sample to 30% of the device resolution, 70% fewer pixels need the expensive calculations. The quality is also lower. I have simplified this by making 6 different levels of down-sampling.
    2. Texture samples: All the processing that is done is convolution, so a lot of texture samples need to happen for each pixel in the post-processing stage. Currently, for simplicity, I came up with 8 different levels of this: level 1 samples the texture 3 times per pixel, level 2 samples the texture 7 times per pixel, etc. More samples = a more accurate and dramatic effect.
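The two settings above multiply out to the 48 combinations mentioned in the intro (6 down-sampling levels times 8 texture-sample levels). As a rough illustration only, the sketch below enumerates them and compares two settings under an assumed cost model where GPU work is proportional to the fraction of pixels processed times the samples per pixel. That model ignores fixed overhead, memory bandwidth, and caching, and `relative_cost` is a hypothetical helper, not something from the game. Only the sample counts given above (3 and 7) are used.

```python
from itertools import product

DOWNSAMPLE_LEVELS = 6   # number of down-sampling levels, from the post
SAMPLE_LEVELS = 8       # number of texture-sample levels, from the post

# Every (down-sample level, texture-sample level) pair.
all_settings = list(product(range(1, DOWNSAMPLE_LEVELS + 1),
                            range(1, SAMPLE_LEVELS + 1)))
print(len(all_settings))  # 48


def relative_cost(pixel_fraction, samples_per_pixel):
    """Assumed cost model: work scales with pixels processed times samples per pixel."""
    return pixel_fraction * samples_per_pixel


# Level 2 sampling (7 samples) on 30% of the pixels versus
# level 1 sampling (3 samples) on all of the pixels.
print(relative_cost(0.3, 7) < relative_cost(1.0, 3))  # True
```

Under this simplified model, a higher sampling level with heavy down-sampling can cost less than a lower sampling level at full resolution, which is why testing the two settings together can be worthwhile.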

However, my method has some issues clouding its data. These are problems mostly because I cannot predict how long the resulting spikes will last, and they throw off the data:

  1. Every time I change one of the GPU settings there is a lag spike. When I change the texture samples, the uniform causes the code to branch differently and the GPU recompiles; when I change the down-sampling, I have to regenerate the frame buffers.
  2. There is also a lag spike at the beginning of each new CPU test because new particles are created.
  3. Profiling starts right at launch. Even though I wait two seconds, sometimes things still aren't fully steady, and I get really low results on the first test.
  4. Some devices can go ridiculously high on the particle count. Ironically, the better the device, the longer my CPU test takes.

Right now, in order to test, I simply run the game as usual, and for each new test I tweak the particle amount and the GPU amount.

I first test the CPU with post-processing turned off. I start at 0 entities and go up by 100 entities at a time. When the fps seems to drop below the target level (60fps, then 30fps), I test the previous count again. If this new test passes (e.g. 500) but the last test failed (e.g. 600), then I know 500 is the maximum number of entities at that fps.
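Because the linear walk above takes longer the faster the device is, one alternative (not the method described in this post) is to double the entity count until a test fails and then binary-search between the last pass and the first fail. The sketch below assumes a hypothetical callback `passes(count)` that runs one timed test and returns True when the target fps is held, and it assumes that once a count fails, every larger count fails too.

```python
def max_passing_count(passes, step=100, limit=1_000_000):
    """Largest multiple of `step` for which passes(count) is True.

    Returns 0 if even `step` entities fail. Assumes failure is monotonic:
    if a count fails, every larger count fails as well.
    """
    lo = 0        # highest count known to pass (0 entities assumed fine)
    hi = step
    # Phase 1: double the count until a test fails or the limit is exceeded.
    while hi <= limit and passes(hi):
        lo = hi
        hi *= 2
    if hi > limit:
        return lo
    # Phase 2: binary search between lo (passes) and hi (fails), in units of step.
    lo_k, hi_k = lo // step, hi // step
    while hi_k - lo_k > 1:
        mid_k = (lo_k + hi_k) // 2
        if passes(mid_k * step):
            lo_k = mid_k
        else:
            hi_k = mid_k
    return lo_k * step


# Stand-in for a real timed test: this pretend device holds fps up to 550 entities.
print(max_passing_count(lambda count: count <= 550))  # 500
```

Since real measurements are noisy, re-testing the final answer once, as the current method already does, still seems like a sensible safeguard.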

Then I test the GPU. This is basically the same as the last test, except that I give the GPU more work each time. I broke it up into 4 tests:

  1. Luxury: Sees if the device is high-end and finds the best setting.
  2. Level 6: Does level 6 texture sampling and finds the lowest down-sampling possible.
  3. Level 5: Does level 5 texture sampling and finds the lowest down-sampling possible.
  4. Remedial: Does its best to find a glow setting that will work at maximum down-sampling.

Texture sampling levels 5 and 6 are my goal. I don't really care how down-sampled the texture is: I would choose level 6 with 70% down-sampling over level 5 with no down-sampling. Texture sampling levels 1-4 are remedial, for the really low-end devices, and I only ever test these at the highest level of down-sampling. If I get something that works, great; if not, no glow is rendered. Texture sampling levels 7 and 8 are only for high-end devices, and they only get tested with no down-sampling.

To speed up the GPU test I run it in the order Luxury -> Level 6 -> Level 5 -> Remedial, and if one of those passes, it skips the remaining tests.
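The ordering above can be sketched roughly as follows. This is an illustration under assumptions: `passes(sample_level, downsample_level)` is a hypothetical callback that runs one timed test, down-sampling levels are numbered 0 (none) to 5 (maximum) purely for this sketch, and the Luxury tier is assumed to try level 8 before level 7.

```python
def pick_glow_setting(passes, downsample_levels=6):
    """Return (sample_level, downsample_level), or None when no glow should be drawn.

    downsample_level 0 means no down-sampling and downsample_levels - 1 means
    maximum down-sampling (an assumed numbering for this sketch).
    """
    max_down = downsample_levels - 1
    # Luxury: sampling levels 8 and 7, tested only without down-sampling.
    for samples in (8, 7):
        if passes(samples, 0):
            return (samples, 0)
    # Level 6, then level 5: take the least down-sampling that passes.
    for samples in (6, 5):
        for down in range(downsample_levels):
            if passes(samples, down):
                return (samples, down)
    # Remedial: levels 4 down to 1, tested only at maximum down-sampling.
    for samples in (4, 3, 2, 1):
        if passes(samples, max_down):
            return (samples, max_down)
    return None


# Stand-in for a real timed test: a pretend device with a fixed work budget.
print(pick_glow_setting(lambda samples, down: samples * (6 - down) <= 12))  # (6, 4)
```

Returning as soon as one tier passes is what skips the remaining tests, so a high-end device finishes after one or two runs.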

Your big problem is going to be thermal stability: most modern mid-range and high-end phones can generate more heat than they can dissipate if the CPU and GPU are both running at maximum. They may sustain a measurably faster performance point for the first 5-10 minutes after a cold start, but eventually they will warm up enough that they need to down-clock. So your profiling problem becomes time-variant, and you risk picking settings which work well for the first 10 minutes and then hit problems.

If you want a consistent user experience, you probably want to leave quite a bit of headroom rather than dialing things up as far as they can possibly go, and if you are leaving headroom, the lack of 100% stability in the results matters less.
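As a small worked example of what leaving headroom could look like, the sketch below turns a target fps into a frame-time threshold that a profiling test has to beat. The 20% headroom figure and the name `frame_budget_ms` are assumptions made for illustration; the right margin depends on how much a given device throttles once it warms up.

```python
def frame_budget_ms(target_fps, headroom=0.2):
    """Frame-time threshold in milliseconds, leaving `headroom` of each frame unused."""
    return (1000.0 / target_fps) * (1.0 - headroom)


# With 20% headroom, a 60fps setting must average about 13.3 ms per frame
# rather than the full 16.7 ms, and a 30fps setting about 26.7 ms.
print(round(frame_budget_ms(60), 1))  # 13.3
print(round(frame_budget_ms(30), 1))  # 26.7
```

A setting that only barely meets the full frame time on a cold device would fail this stricter check, which is the point: it is the setting most likely to drop frames after the phone heats up.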

Personally, as a user, I don't find this type of setting too much of a problem. In your case it sounds like all you need is two sliders, one for CPU and one for GPU. If a game allows it, I often deliberately turn graphics down below what my device is capable of, to make my battery last longer (I play when I'm travelling, so I don't always have easy access to a charger).
