Benchmarking code in threads/async with std::chrono::steady_clock

Question

Suppose I have many computations that I want to run in multiple threads, and I want to benchmark the CPU time of each. As a toy example:

#include <chrono>
#include <future>
#include <iostream>
#include <vector>


using unit_t = std::chrono::nanoseconds;

unit_t::rep expensive_computation() {
    auto start = std::chrono::steady_clock::now();
    // Something time-consuming here...
    auto end = std::chrono::steady_clock::now();

    auto duration = std::chrono::duration_cast<unit_t>(end - start).count();

    return duration;
}

int main() {
    std::vector<std::future<unit_t::rep>> computations;

    for (int i = 0; i < 100; i++) {
        computations.push_back(std::async(expensive_computation));
    }

    for (size_t i = 0; i < computations.size(); i++) {
        auto duration = computations[i].get();
        std::cout << "#" << i << " took " << duration << "ns" << std::endl;
    }
}

I'm concerned that, since steady_clock is monotonic across threads, the underlying clock ticks per process rather than per thread (that is, if any thread is scheduled, the clock ticks for all threads). This would mean that if a thread were sleeping, steady_clock would still be ticking for it, and this time would incorrectly be included in the duration for that thread. Is my suspicion correct? Or does steady_clock tick only for the CPU time consumed by the calling thread?

Put another way, is this approach a safe way to time many computations independently (such that no CPU time spent on one thread affects the duration reported for another thread)? Or do I need to spin off a separate process for each computation so that steady_clock only ticks while that computation is running/scheduled?

edit: I also recognize that spinning up more threads than cores may be an inefficient approach to this problem (although I don't particularly care about computation throughput; I just want them all, as a group, to complete in the shortest time). I suspect that in practice I'd need to maintain a small, bounded set of threads in flight (say, capped at the number of cores) and only start a new computation as a core becomes available. But this shouldn't have an impact on the timing I care about above; it should only affect the wall-clock time.
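The bounded-threads idea from the edit could be sketched roughly as follows. This is only an illustration, not part of the original post: the helper name run_bounded and its parameters are hypothetical, and it assumes the task does not throw. Each worker pulls the next task index from a shared atomic counter, so at most max_workers computations are in flight at once.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

using unit_t = std::chrono::nanoseconds;

// Hypothetical helper (not from the original post): runs `task` `count` times
// on at most `max_workers` threads and returns one steady_clock duration per run.
// Assumption: `task` does not throw (an escaping exception would call std::terminate).
template <class Task>
std::vector<unit_t::rep> run_bounded(
        std::size_t count, Task task,
        unsigned max_workers = std::thread::hardware_concurrency()) {
    if (max_workers == 0) max_workers = 1;  // hardware_concurrency() may return 0

    std::vector<unit_t::rep> durations(count);
    std::atomic<std::size_t> next{0};  // index of the next task to claim

    auto worker = [&] {
        // Each index is claimed by exactly one worker, so each slot of
        // `durations` is written by exactly one thread.
        for (std::size_t i = next.fetch_add(1); i < count; i = next.fetch_add(1)) {
            auto start = std::chrono::steady_clock::now();
            task();
            auto end = std::chrono::steady_clock::now();
            durations[i] = std::chrono::duration_cast<unit_t>(end - start).count();
        }
    };

    std::vector<std::thread> pool;
    const std::size_t n = std::min<std::size_t>(max_workers, count);
    for (std::size_t w = 0; w < n; ++w) pool.emplace_back(worker);
    for (auto& t : pool) t.join();  // join makes the results visible to the caller
    return durations;
}
```

A call such as run_bounded(100, some_computation) would replace the std::async loop in the toy example. The durations are still wall-clock intervals, so they include any time a worker spends descheduled.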

Answer 1

The standard specifies that steady_clock models physical time (as opposed to CPU time).

From [time.clock.steady]:

Objects of class steady_clock represent clocks for which values of time_point never decrease as physical time advances and for which values of time_point advance at a steady rate relative to real time. That is, the clock may not be adjusted.

That being said, how well an implementation models physical time is a quality-of-implementation (QOI) issue. Nevertheless, your code looks fine to me.

Should your experiments prove unsatisfactory, clients of <chrono> can also author their own custom clocks, which have first-class status within the <chrono> library.

Answer 2

This would mean that if a thread were sleeping, steady_clock would still be ticking for it and this time would incorrectly be included in the duration for that thread.

That would not be incorrect, though, because the standard specifies that std::chrono::steady_clock measures physical time, not CPU time or any other kind of time. See [time.clock.steady]:

Objects of class steady_clock represent clocks for which values of time_point never decrease as physical time advances and for which values of time_point advance at a steady rate relative to real time ...

That said, your code looks fine in that it will give you the physical time measured for each thread's run. Is there a reason you want to measure CPU time here? If so, let me know in the comments.
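The behaviour the question worries about can be checked with a small experiment. The sketch below is an illustration rather than part of the answer, and the helper name steady_elapsed_while_sleeping is made up: it puts the calling thread to sleep and reports how much time steady_clock says has passed.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper (not from the original answer): sleeps for `nap` and
// returns the interval steady_clock measured around the sleep.
std::chrono::milliseconds steady_elapsed_while_sleeping(std::chrono::milliseconds nap) {
    auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(nap);  // the thread is blocked and uses almost no CPU
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
}
```

Calling it with, say, std::chrono::milliseconds(50) should report roughly the requested sleep or more, which is consistent with steady_clock tracking physical time rather than the thread's CPU time.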

Answer 1: 3, accepted, 2018-08-26 15:12:25

Answer 2: 2, 2018-08-26 15:19:54
