
Which uses less CPU while waiting for a function to return: std::future::wait(), or checking a flag in a loop that sleeps between checks?

Q1: Which uses less CPU, std::future::wait() or checking a flag in a while loop?

std::atomic_bool isRunning{false};

void foo(){
    isRunning.store(true);
    doSomethingTimeConsuming();
    isRunning.store(false);
}

std::future<void> f = std::async(std::launch::async, foo);

Using std::future::wait():

if(f.valid())
    f.wait();

Checking the flag in a while loop:

if(f.valid()){
    while(isRunning.load())
       std::this_thread::sleep_for(1ms);
}

Q2: Does the same conclusion apply to std::thread::join() or std::condition_variable::wait()?

Thanks in advance.

The polling loop with std::this_thread::sleep_for keeps waking the waiter thread unnecessarily, and at the wrong times. On average, the delay between the result becoming ready and the waiter thread noticing it is half the sleep_for timeout.

std::future::wait is more efficient because it blocks in the kernel until the result is ready, without making repeated, unnecessary syscalls, unlike the std::this_thread::sleep_for loop.
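To make the difference concrete, here is a minimal sketch that counts how often the polling waiter wakes up, next to the single blocking call. The function names wait_by_polling and wait_by_blocking, the lambda, and the 50ms stand-in for doSomethingTimeConsuming are assumptions for illustration only, not part of the original question.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

using namespace std::chrono_literals;

// Hypothetical stand-in for the real work (assumed duration: 50ms).
void doSomethingTimeConsuming() { std::this_thread::sleep_for(50ms); }

// Polling variant: returns how many times the waiter slept and woke up again.
int wait_by_polling() {
    std::atomic_bool isRunning{true};  // starts true so the waiter cannot miss the task
    std::future<void> f = std::async(std::launch::async, [&] {
        doSomethingTimeConsuming();
        isRunning.store(false);
    });
    int wakeups = 0;
    while (isRunning.load()) {
        std::this_thread::sleep_for(1ms);  // each iteration is a sleep plus a wakeup
        ++wakeups;
    }
    f.wait();  // make sure the task has fully finished before isRunning goes out of scope
    return wakeups;
}

// Blocking variant: one call that returns once the task has finished.
void wait_by_blocking() {
    std::future<void> f = std::async(std::launch::async, doSomethingTimeConsuming);
    if (f.valid())
        f.wait();
}
```

Calling wait_by_polling() from main and printing the result should show a count that grows with the task duration, whereas wait_by_blocking() needs no loop at all. The exact count depends on the scheduler and timer resolution.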

If you run the two versions with

void doSomethingTimeConsuming() {
    std::this_thread::sleep_for(1s);
}

under perf stat, the results for std::future::wait are:

          1.803578      task-clock (msec)         #    0.002 CPUs utilized          
                 2      context-switches          #    0.001 M/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               116      page-faults               #    0.064 M/sec                  
         6,356,215      cycles                    #    3.524 GHz                    
         4,511,076      instructions              #    0.71  insn per cycle         
           835,604      branches                  #  463.304 M/sec                  
            22,313      branch-misses             #    2.67% of all branches        

Whereas for std::this_thread::sleep_for(1ms):

         11.715249      task-clock (msec)         #    0.012 CPUs utilized          
               901      context-switches          #    0.077 M/sec                  
                 6      cpu-migrations            #    0.512 K/sec                  
               118      page-faults               #    0.010 M/sec                  
        40,177,222      cycles                    #    3.429 GHz                      
        25,401,055      instructions              #    0.63  insn per cycle
         2,286,806      branches                  #  195.199 M/sec  
           156,400      branch-misses             #    6.84% of all branches        

I.e., in this particular test, the sleep_for loop burns roughly 6 times as many CPU cycles (about 40.2M versus 6.4M) and incurs 901 context switches instead of 2.


Note that there is a race condition between isRunning.load() and isRunning.store(true): the waiter can check the flag before foo has set it, see false, and leave the loop immediately. A fix is to initialize isRunning{true}.
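A sketch of that fix, assuming the flag starts as true and foo only ever clears it. The workDone flag, the wait_for_foo helper, and the 20ms sleep are hypothetical additions used only to observe that the waiter really outlasts the task.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

using namespace std::chrono_literals;

std::atomic_bool isRunning{true};  // true before the task is even launched
std::atomic_bool workDone{false};  // hypothetical marker, only used to observe the result

void foo() {
    std::this_thread::sleep_for(20ms);  // stand-in for doSomethingTimeConsuming()
    workDone.store(true);
    isRunning.store(false);             // the only store the waiter depends on
}

// Returns true if the work had completed by the time the polling loop exited.
bool wait_for_foo() {
    std::future<void> f = std::async(std::launch::async, foo);
    while (isRunning.load())
        std::this_thread::sleep_for(1ms);
    return workDone.load();
}
```

Because isRunning is already true when the waiter first loads it, the loop cannot exit before foo has run. Since the flag is a global that is never reset, this sketch supports a single wait only.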

