
Fastest way to measure global time (wall-clock) in multithreaded application with C++

I am working on a high-performance library where I need to record the time point at which a thread finishes a computation and then save that time point in a global variable, so that this global variable always holds the most recent finishing time of any thread.

Right now, I am using the C++ std::chrono library with timestamps to measure the time like this:

auto start = std::chrono::high_resolution_clock::now().time_since_epoch();
// thread calculates something
auto finish = std::chrono::high_resolution_clock::now().time_since_epoch();
unsigned time = std::chrono::duration_cast<std::chrono::microseconds>(finish-start).count();
// now I can use the needed time and also update a global variable with the finish time point.

This works pretty well. But...

A call to std::chrono is slower than a call to rdtsc.

The rdtsc version:

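typedef unsigned long long ticks; /* assumed definition; not shown in the original snippet */

static __inline__ ticks getticks(void)
{
     unsigned a = 0, d;
     /* cpuid serializes execution; it overwrites eax, ebx, ecx and edx, so declare them */
     asm volatile("cpuid" : "+a" (a) : : "ebx", "ecx", "edx", "memory");
     /* rdtsc returns the low 32 bits in eax and the high 32 bits in edx */
     asm volatile("rdtsc" : "=a" (a), "=d" (d));

     return (((ticks)a) | (((ticks)d) << 32));
}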

tick = getticks();
sleep(1); // or whatever calculation
tick1 = getticks();
time = (unsigned)((tick1-tick)/2400000/* ticks per millisecond, presumably for a 2.4 GHz CPU */);

Comparison: I used rdtsc itself to measure how many ticks a chrono call and an rdtsc call each take. The results are:

  • chrono needed about 34096 ticks
  • rdtsc needed about 1744 ticks

Problem:

I can't use rdtsc because it is, as far as I know, relative only. I can't use it to measure time points, right? I don't want just the durations of some calculation but also the actual finishing time point, so that every thread knows when the most recent finish happened.

Question: What is the fastest way to measure global time points and share them across all threads?
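
For the sharing part, one common approach is an atomic "maximum" update, so that the global only ever moves forward. The following is a minimal sketch and not code from the question: it assumes the time point is stored as integer microseconds from std::chrono::steady_clock, and the names g_latest_finish_us, now_us and publish_finish are hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical global: latest finish time in microseconds since the steady_clock epoch.
// 0 is used here to mean "no thread has finished yet" (an assumption of this sketch).
std::atomic<std::int64_t> g_latest_finish_us{0};

// Read the current time as a plain integer so it fits in a lock-free atomic.
inline std::int64_t now_us() {
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
}

// Raise the global to finish_us unless another thread already stored a later value.
inline void publish_finish(std::int64_t finish_us) {
    std::int64_t seen = g_latest_finish_us.load(std::memory_order_relaxed);
    while (seen < finish_us &&
           !g_latest_finish_us.compare_exchange_weak(seen, finish_us,
                                                     std::memory_order_release,
                                                     std::memory_order_relaxed)) {
        // On failure compare_exchange_weak reloads `seen`; the loop condition
        // then re-checks whether our value is still the newer one.
    }
}
```

A worker would call publish_finish(now_us()) when it completes, and any thread can read the value with g_latest_finish_us.load(). The compare-and-swap loop matters because two threads may finish in one order but reach the store in the other; a plain store could then overwrite a later time with an earlier one.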

"I can't use rdtsc because it is, as far as I know, relative only."

It is relative to some unspecified time point, e.g. CPU power-on time.

"I can't use it to measure time points, right?"

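You use rdtsc to measure durations in CPU cycles. You can also treat the value as a time point relative to that unspecified starting time. And you can find out the wall-clock time of that starting time, for example by reading a wall clock and the counter together once, which lets you convert any later rdtsc value into a wall-clock time point.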
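
The conversion could look roughly like the sketch below. It is an illustration under assumptions, not a definitive implementation: it assumes the counter runs at a constant rate and is consistent across cores (as with an invariant TSC), and the TscCalibration struct and the calibrate and ticks_to_wall_ns functions are hypothetical names. The two (ticks, wall_ns) samples are meant to be taken some time apart, each by reading the counter and std::chrono::system_clock back to back.

```cpp
#include <cstdint>

// Hypothetical calibration record: one counter reading paired with the wall clock.
struct TscCalibration {
    std::uint64_t base_ticks;    // counter value at calibration
    std::int64_t  base_wall_ns;  // wall-clock nanoseconds since epoch at calibration
    double        ticks_per_ns;  // measured counter frequency
};

// Derive the calibration from two (ticks, wall_ns) samples taken some time apart.
inline TscCalibration calibrate(std::uint64_t t0, std::int64_t w0,
                                std::uint64_t t1, std::int64_t w1) {
    return {t1, w1, static_cast<double>(t1 - t0) / static_cast<double>(w1 - w0)};
}

// Convert a raw counter reading to wall-clock nanoseconds since the epoch.
inline std::int64_t ticks_to_wall_ns(const TscCalibration& c, std::uint64_t ticks) {
    // Signed difference, so readings taken before the calibration point also work.
    const auto delta = static_cast<std::int64_t>(ticks - c.base_ticks);
    return c.base_wall_ns +
           static_cast<std::int64_t>(static_cast<double>(delta) / c.ticks_per_ns);
}
```

With this in place the hot path only pays for the rdtsc read; the slower wall-clock call is needed just for calibration. A longer gap between the two samples gives a more accurate frequency estimate, and recalibrating now and then limits drift against the system clock.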


If you use gcc, __builtin_ia32_rdtsc generates better assembly than hand-coded versions.
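
A minimal wrapper around the builtin might look like this. The function name read_ticks is hypothetical, and the steady_clock fallback for other targets is an assumption added only so the sketch compiles everywhere; it is not part of the advice above. Note that the builtin emits a plain rdtsc without the serializing cpuid used in the hand-written version.

```cpp
#include <chrono>
#include <cstdint>

// Read a raw tick counter. On x86/x86-64 with GCC or Clang this uses the builtin;
// elsewhere it falls back to steady_clock nanoseconds (assumption, portability only).
inline std::uint64_t read_ticks() {
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
    return __builtin_ia32_rdtsc();
#else
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
#endif
}
```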
