Why is my rendering thread taking up 100% CPU?

So right now in my OpenGL game engine, when my rendering thread has literally nothing to do, it takes up all the CPU it can get. Windows Task Manager shows my application taking up 25% processing (I have 4 hardware threads, so 25% is the maximum a single thread can take). When I don't start the rendering thread at all I get 0-2% (which is worrying on its own, since all it's doing is running an SDL input loop).

So, what exactly is my rendering thread doing? Here's some code:

Timer timer;

while (gVar.running)
{
   timer.frequencyCap(60.0);

   beginFrame();
   drawFrame();
   endFrame();
}

Let's go through each of those. Timer is a custom timer class I made using SDL_GetPerformanceCounter. timer.frequencyCap(60.0); is meant to ensure that the loop doesn't run more than 60 times per second. Here's the code for Timer::frequencyCap():

double Timer::frequencyCap(double maxFrequency)
{
    double duration;

    update();
    duration = _deltaTime;
    if (duration < (1.0 / maxFrequency))
    {
        double dur = ((1.0 / maxFrequency) - duration) * 1000000.0;
        this_thread::sleep_for(chrono::microseconds((int64)dur));
        update();
    }

    return duration;
}

void Timer::update(void)
{
    if (_freq == 0)
        return;

    _prevTicks = _currentTicks;
    _currentTicks = SDL_GetPerformanceCounter();

      // Some sanity checking here. //
      // The only way _currentTicks can be less than _prevTicks is if we've wrapped around to 0. //
      // So, we need some other way of calculating the difference. //
    if (_currentTicks < _prevTicks)
    {
          // If we take the difference between UINT64_MAX and _prevTicks, then add that to _currentTicks, we get the proper difference between _currentTicks and _prevTicks. //
        uint64 dif = UINT64_MAX - _prevTicks;

          // The +1 here prevents an off-by-1 error. In truth, the error would be pretty much indistinguishable, but we might as well be correct. //
        _deltaTime = (double)(_currentTicks + dif + 1) / (double)_freq;
    }
    else
        _deltaTime = (double)(_currentTicks - _prevTicks) / (double)_freq;
}

The next 3 functions are considerably simpler (at this stage):

void Renderer::beginFrame()
{
      // Perform a resize if we need to. //
   if (_needResize)
   {
      gWindow.getDrawableSize(&_width, &_height);
      glViewport(0, 0, _width, _height);
      _needResize = false;
   }

   glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);
}

void Renderer::endFrame()
{
   gWindow.swapBuffers();
}

void Renderer::drawFrame()
{
}

The rendering thread was created using std::thread. The only explanation I can think of is that timer.frequencyCap somehow isn't working, except I use that exact same function in my main thread and I idle at 0-2%.

What am I doing wrong here?

If V-Sync is enabled and your program honors the swap intervals, then seeing your program take up 100% is actually an artifact of how Windows measures CPU time. It's been a long-known issue: any time your program blocks in a driver context (which is what happens when OpenGL blocks on a V-Sync), Windows accounts this as the program actually consuming CPU time, while it's really just idling.

If you add a Sleep(1) right after swapping buffers, it will trick Windows into more sane accounting; on some systems even a Sleep(0) does the trick.

Anyway, the 100% is just a cosmetic problem, most of the time.


In the past weeks I've done some exhaustive research on low-latency rendering (i.e. minimizing the time between user input and the corresponding photons coming out of the display), since I'm getting a VR headset soon. Here's what I found out regarding timing SwapBuffers: the sane solution to the problem is actually to time the frame rendering and add an artificial sleep before SwapBuffers, so that you wake up only a few ms before the V-Sync. However, this is easier said than done, because OpenGL is highly asynchronous and explicitly adding syncs will slow down your throughput.

If you have a complex scene or non-optimized rendering:

  • you may have hit a bottleneck somewhere, or have an error in your GL code
  • then the framerate usually drops to around 20 fps (at least on NVidia), no matter the complexity of the scene
  • for very complex scenes it can drop even below that

Try this:

  1. Measure the time it takes to process this:

     beginFrame(); drawFrame(); endFrame(); 
    • there you will see your fps limit
    • compare it to the scene complexity / HW capability
    • and decide whether it is a bug or just too complex a scene
    • try turning off some GL stuff
    • for example, last week I discovered that turning CULL_FACE off actually sped up one of my non-optimized renderers about 10-100 times, and to this day I don't understand why (it was old GL code)
  2. Check for GL errors.

  3. I do not see any glFlush()/glFinish() in your code.

    • try measuring with glFinish();
  4. If you can't sort this out, you can still use a dirty trick:

    • add Sleep(1); to your code
    • it will force your thread to sleep, so it will never use 100% of the power
    • the time it sleeps is 1 ms + scheduler granularity, so it also limits the target fps
    • you use this_thread::sleep_for(chrono::microseconds((int64)dur));
    • I do not know that function; are you really sure it does what you think?
