
Time calculation with TSC (Time Stamp Counter)

I am trying to measure, with very high accuracy, the time taken by some code inside the Linux kernel, using a Linux kernel module.

For this purpose I tried rdtscl(), which gives the number of clock ticks used by the code, as shown below:

unsigned long ini, end;
rdtscl(ini);
//some code...
rdtscl(end);
printk(KERN_INFO "time taken=%lu ticks\n", end - ini);

According to http://en.wikipedia.org/wiki/Time_Stamp_Counter, the TSC is a 64-bit register present on all x86 processors since the Pentium. So, if I have a dual-core processor, is this counter present in both cores, or is there only one, since it is a single processor with two cores?

My second question: I have an Intel Xeon i3 processor which has 4 processors, each of them having 2 cores. Will measuring the clock ticks then give the ticks of a single processor, or the sum over all 4 processors?

If you get NO clock ticks, then there's something seriously wrong with your code. Did you write your own rdtscl [or copy it from somewhere that isn't a good source]?

By the way, modern Intel (and AMD) processors may well have "constant TSC", so a processor that is halted, sleeping, running slower, etc., will still tick away at the same rate as the others. It may still not be in sync with them, but that's a different matter.

Try running just a loop that prints the value from your counter. The RDTSC instruction itself should take some 30-50 clock cycles, so you should see it moving.
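As a rough user-space sketch of that check (not kernel code), the loop below prints a few raw readings and reports the smallest gap between two back-to-back reads. It assumes GCC or Clang on x86, where `<x86intrin.h>` provides the `__rdtsc()` intrinsic; the function names `print_tsc_samples` and `min_tsc_delta` are made up for this illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc(): assumes GCC/Clang targeting x86 */

/* Print a few raw readings so the counter can be seen advancing. */
static void print_tsc_samples(int samples)
{
    for (int i = 0; i < samples; i++)
        printf("tsc=%llu\n", (unsigned long long)__rdtsc());
}

/* Smallest difference observed between two back-to-back TSC reads.
 * Taking the minimum filters out samples disturbed by interrupts. */
static uint64_t min_tsc_delta(int samples)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < samples; i++) {
        uint64_t a = __rdtsc();
        uint64_t b = __rdtsc();
        if (b - a < best)
            best = b - a;
    }
    return best;
}
```

If the printed values never change, or the minimum delta comes out as zero, the read routine itself is the likely culprit.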

Edit: Here's my rdtsc function:

void rdtscl(unsigned long long *ll)
{
    unsigned int lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));                        
    *ll = ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );  
}

Alternatively, as a function returning a value:

unsigned long long rdtscl(void)
{
    unsigned int lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));                        
    return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );  
}

I notice that your code doesn't pass a pointer to your unsigned long, which makes me suspect that you are not actually passing the timestamp counter BACK to the caller, but rather just keeping whatever value the variable happens to have, which may well be the same for both values.
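For illustration, the value-returning variant could be used to time a region as sketched below. This is a user-space sketch assuming GCC or Clang on x86; `read_tsc`, `busy_work` and `measure_ticks` are hypothetical names, and `busy_work` merely stands in for "some code". No serialization or CPU pinning is attempted, so the result is only a rough figure.

```c
#include <stdint.h>

/* Full 64-bit TSC read, same approach as the value-returning variant above. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical workload: sums 0..n-1; volatile keeps the loop from being removed. */
static uint64_t busy_work(unsigned n)
{
    volatile uint64_t sum = 0;
    for (unsigned i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Ticks elapsed while running the workload once. */
static uint64_t measure_ticks(unsigned n)
{
    uint64_t start = read_tsc();
    busy_work(n);
    uint64_t end = read_tsc();
    return end - start;   /* both values come back to the caller by return */
}
```

Because each reading is returned to the caller, the two values cannot silently stay equal the way an unmodified by-value argument would.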

All the cores have their own TSC; it basically counts cycles. But beware: the TSC clocks may not be synchronized! If your code starts running on one core and migrates to a second one, which is certainly possible in the general case, your count will be wrong!

The same Wikipedia article describes issues with the TSC, as quoted below:

With the advent of multi-core/hyper-threaded CPUs, systems with multiple CPUs, and 
hibernating operating systems, the TSC cannot be relied on to provide accurate results 
— unless great care is taken to correct the possible flaws: rate of tick and whether 
all cores (processors) have identical values in their time-keeping registers. **There 
is no promise that the timestamp counters of multiple CPUs on a single motherboard will 
be synchronized**. In such cases, programmers can only get reliable results by locking 
their code to a single CPU. Even then, the CPU speed may change due to power-saving 
measures taken by the OS or BIOS, or the system may be hibernated and later resumed 
(resetting the time stamp counter). In those latter cases, to stay relevant, the 
counter must be recalibrated periodically (according to the time resolution your 
application requires).

This means modern CPUs can alter their clock rate to save power, which can affect the TSC value. Also, the TSC would not increment in situations such as when the kernel executes HALT and stops the processor until an external interrupt is received.
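The rate dependence matters as soon as a tick count is turned into time. A minimal sketch of that conversion follows, assuming the tick rate is known and fixed over the measured interval; `ticks_to_ns` and its `tsc_hz` parameter are illustrative names, and how the rate is obtained is left to the caller.

```c
#include <stdint.h>

/*
 * Convert a TSC tick delta to nanoseconds.
 * tsc_hz is an ASSUMED, caller-supplied tick rate (must be nonzero). The
 * result is only meaningful if the counter ticked at that fixed rate for
 * the whole interval.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t tsc_hz)
{
    uint64_t secs = ticks / tsc_hz;   /* whole seconds */
    uint64_t rem  = ticks % tsc_hz;   /* leftover ticks, less than tsc_hz */

    /* rem * 1e9 stays within 64 bits for rates below roughly 18 GHz. */
    return secs * 1000000000ULL + (rem * 1000000000ULL) / tsc_hz;
}
```

At an assumed rate of 3 GHz, 3000 ticks come to 1000 ns. If the rate changed during the interval, the same tick count would correspond to a different amount of time.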

The second question is that I have an Intel Xeon i3 processor which has 4 processors,
each having 2 cores. Will measuring the clock ticks then give the ticks of a single
processor or the sum over all 4 processors?

This may lead to a situation where a process reads a time on one processor, moves to a second processor, and encounters a time earlier than the one it read on the first, which makes the TSC an unstable time source.

Some of the things mentioned here are accurate, such as the TSC not being a measure of time because of S states in the CPU. But I think the TSC can be used for relative sequencing even in a multi-core environment. There is a flag called TSCInvariant which is set to true in Intel CPUs of the Nehalem architecture or later. In those CPUs the TSC advances at a constant rate on all cores. Therefore you will never go back in TSC count if you get context-switched to a different core.

In Ubuntu you can do sudo apt-get install cpuid

cpuid | grep TscInvariant to verify it on your desktop.
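The same flag can also be read from a program. A minimal sketch, assuming GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid()`: CPUID leaf 0x80000007 reports the invariant TSC in bit 8 of EDX. The function name `has_invariant_tsc` is made up for this example.

```c
#include <cpuid.h>   /* __get_cpuid(): assumes GCC/Clang targeting x86 */

/* Returns 1 if the CPU advertises an invariant TSC, 0 otherwise. */
static int has_invariant_tsc(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
        return 0;

    return (edx >> 8) & 1;   /* EDX bit 8: invariant TSC */
}
```

A result of 0 only means the flag is not advertised; inside a virtual machine the hypervisor may hide or alter it.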

