clock_gettime may be very slow even using VDSO

Question

I'm using CentOS Linux release 7.3.1611 on an Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz.

During tests of my userspace application, I noticed that clock_gettime(CLOCK_MONOTONIC, &ts) may take up to 5-6 microseconds instead of ~23 nanoseconds on average. It may happen only once per 10000 consecutive calls, but it does happen.

Without the VDSO library, this could be explained. However, the VDSO is used for every clock_gettime call (I checked with strace).

It makes no difference whether the corresponding thread is pinned to a specific CPU core, or whether that core is isolated from the OS. That means the test app may run on an exclusive CPU core, and the lag can still appear!

I'm measuring latency by comparing the results of two consecutive clock_gettime calls, like:

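#include <time.h>

/* Not shown in the original post; assumed to be the number of ns per second. */
#define NANO_SECONDS_IN_SEC 1000000000ULL

/* Elapsed nanoseconds between two back-to-back clock_gettime calls. */
unsigned long long __gettimeLatencyNs() {
    struct timespec t1_ts;
    struct timespec t2_ts;
    clock_gettime(CLOCK_MONOTONIC, &t1_ts);
    clock_gettime(CLOCK_MONOTONIC, &t2_ts);
    return ((t2_ts.tv_sec - t1_ts.tv_sec)*NANO_SECONDS_IN_SEC + t2_ts.tv_nsec - t1_ts.tv_nsec);
}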

Could anyone share some ideas about what could be wrong here?

Answer 1

Let's look at the source code for clock_gettime:

/* Code size doesn't matter (vdso is 4k anyway) and this is faster. */
notrace static int __always_inline do_realtime(struct timespec *ts)
{
    unsigned long seq;
    u64 ns;
    int mode;

    do {
        seq = gtod_read_begin(gtod);
        mode = gtod->vclock_mode;
        ts->tv_sec = gtod->wall_time_sec;
        ns = gtod->wall_time_snsec;
        ns += vgetsns(&mode);
        ns >>= gtod->shift;
    } while (unlikely(gtod_read_retry(gtod, seq)));

    ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
    ts->tv_nsec = ns;

    return mode;
}

What we see here is that the code runs inside a loop. The loop condition is annotated as unlikely. The condition exists because this code reads shared memory that is updated from time to time, and while an update is in progress, the code needs to wait for it to complete.

The most likely answer to your question, then, is that every so often you catch clock_gettime while the corresponding kernel code is updating its structures. When that happens, the code runs significantly slower.
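
To make the retry loop concrete, here is a minimal userspace sketch of the same read-and-retry pattern (a sequence lock), written with C11 atomics. It is not the kernel's code: struct shared_time, shared_time_write and shared_time_read are hypothetical names invented for illustration, a single writer is assumed, and the default sequentially consistent atomics are used for simplicity rather than speed. The reader returns how many times it had to retry, which is the quantity that would matter for the slowdown described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared time data; not the real kernel layout. */
struct shared_time {
    atomic_uint seq;        /* even: data is stable, odd: update in progress */
    _Atomic uint64_t sec;
    _Atomic uint64_t nsec;
};

/* Writer: make seq odd, update the fields, then make seq even again. */
void shared_time_write(struct shared_time *st, uint64_t sec, uint64_t nsec)
{
    unsigned s = atomic_load(&st->seq);
    assert((s & 1u) == 0);  /* single writer assumed, so seq must be even here */
    atomic_store(&st->seq, s + 1);
    atomic_store(&st->sec, sec);
    atomic_store(&st->nsec, nsec);
    atomic_store(&st->seq, s + 2);
}

/* Reader: loop until a consistent snapshot is read; returns the retry count. */
unsigned shared_time_read(struct shared_time *st, uint64_t *sec, uint64_t *nsec)
{
    unsigned retries = 0;
    for (;;) {
        unsigned begin = atomic_load(&st->seq);
        if ((begin & 1u) == 0) {
            *sec = atomic_load(&st->sec);
            *nsec = atomic_load(&st->nsec);
            /* Unchanged seq means no writer ran while we were reading. */
            if (atomic_load(&st->seq) == begin)
                return retries;
        }
        retries++;          /* writer was active: try again */
    }
}
```

With no concurrent writer, shared_time_read returns 0 retries. A reader only pays extra when its read overlaps an update.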

Answer 2

I don't think it's the logic of the clock_gettime call itself that is periodically taking longer, but rather that your timing loop is periodically being interrupted, and this extra time shows up as an extra-long interval.

That is, any type of timing loop is subject to being interrupted by external events, such as interrupts. For example, except with a very specific tickless kernel configuration (not the default), your application will be interrupted periodically by the clock interrupt, which does a bit of processing to see if another process should run. Even if no other process ends up running, this could easily account for a few microseconds.

In addition, the hardware may temporarily pause for a variety of reasons, such as frequency transitions that occur when other cores enter or leave the idle state. I have measured these transitions at around 8 microseconds, close to the value you report. During these pauses, the CPU isn't executing instructions, but the TSC keeps running, so the pause shows up as an extra-long interval.
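
If you want to see how often and how large these extra-long intervals are, a sketch along the following lines may help. It repeats the back-to-back measurement from the question and records the minimum, the maximum, and the number of gaps above a threshold. The names timespec_diff_ns, gap_stats and measure_gaps are hypothetical helpers made up for this example. Note that it cannot tell you why a gap was long, only that it was.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds from a to b (b is assumed not to be earlier than a). */
int64_t timespec_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (b->tv_nsec - a->tv_nsec);
}

/* Hypothetical summary of many back-to-back clock_gettime pairs. */
struct gap_stats {
    int64_t min_ns;
    int64_t max_ns;
    long outliers;          /* gaps strictly above threshold_ns */
};

struct gap_stats measure_gaps(long samples, int64_t threshold_ns)
{
    struct gap_stats s = { INT64_MAX, 0, 0 };
    struct timespec t1, t2;

    assert(samples > 0);
    for (long i = 0; i < samples; i++) {
        clock_gettime(CLOCK_MONOTONIC, &t1);
        clock_gettime(CLOCK_MONOTONIC, &t2);
        int64_t gap = timespec_diff_ns(&t1, &t2);
        if (gap < s.min_ns) s.min_ns = gap;
        if (gap > s.max_ns) s.max_ns = gap;
        if (gap > threshold_ns) s.outliers++;
    }
    return s;
}
```

For example, measure_gaps(1000000, 1000) counts how many of a million pairs took longer than one microsecond. Comparing that count between an isolated core and a normal one gives a rough feel for how much of the effect comes from interruptions.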

Beyond that, there are a ton of reasons why you would experience outlier timings. That answer also includes ways in which you could narrow down the possible reasons, if it interests you.

Finally, the other answer suggests that clock_gettime itself may be blocking while the kernel updates the data structure. While that's certainly possible, I think it's less likely than the other reasons. You could copy and paste the VDSO code, modify it to record whether any blocking actually happened, and call that version to see if your pauses correlate with blocking. I would guess not.
