
GPROF profiling tool reports inaccurate execution time

I tried to profile my C++ code with gprof on Ubuntu.

But I found what looks like a bug.

When gprof calculates execution time, the minimum time unit is 0.01 seconds.

For example, if the execution time of a function in my program is 0.001 seconds or even less, gprof reports it as 0 seconds.

Even if I execute my function a thousand times, it calculates it like this: 0s + 0s + … + 0s = 0s.

But the real running time is 1 second…

So I want to know how to change the minimum time unit, or how to calculate the exact execution time.

Please help me :)

And I don't need any recommendations for other profiling tools.

This question is almost a duplicate of inaccuracy in gprof output, but with a minor difference: it looks like it tries to find the performance bottleneck in the wrong place:

Even if I execute my function a thousand times, it calculates it like this: 0s + 0s + … + 0s = 0s.

This is not how gprof works. Gprof samples the program counter once every T (normally 0.01 seconds). It does not simply sum up time measurements; it relies on statistics. The chance that a program that uses 1.00 s of CPU time is never hit by any of the roughly 100 samples it should get is extremely low. 80 samples is possible, 120 is possible, but 0 is virtually impossible. So your problem lies elsewhere.
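To see what that means in practice, here is a minimal, hypothetical test case (the function name and loop counts are mine, not from your question): work() burns very roughly a millisecond of CPU time and is called a thousand times, so a gprof build should attribute on the order of 100 samples, i.e. about one second, to it.

// demo.cpp -- build with: g++ -pg demo.cpp && ./a.out && gprof ./a.out gmon.out
#include <cstdio>

volatile double sink;                  // volatile keeps the loop from being optimized away

void work()
{
    double acc = 0.0;
    for (int i = 0; i < 200000; ++i)   // tune the count for roughly 1 ms on your CPU
        acc += i * 0.5;
    sink = acc;
}

int main()
{
    for (int i = 0; i < 1000; ++i)     // roughly 1000 ms of CPU time in total
        work();
    std::printf("done\n");
}

If gprof attributes 0 seconds to work() in a run like this, the time is being lost to one of the limitations below, not to the 0.01 s unit.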

Gprof has many limitations, as can be seen at inaccuracy in gprof output. The real problem is probably that the time is spent in I/O, in a complicated mutual recursion, or in a shared library, or that the program tries to reuse the same signal that gprof uses to sample the code.

If you still insist on changing the sampling rate, it seems possible in theory, but it is too complicated to be worth it. There have been claims that rewriting the profil() or monstartup() functions can do it; you can override them using linker facilities such as LD_PRELOAD (a sketch of such an interposer follows the quote below). Given the limitations of gprof, this path is not worthwhile, and I could not find any reference to code that actually did it.

Here is a quote by Nick Clifton on the matter:

So your choices are:

  1. Alter the profil() function in your OS.
  2. Write your own monstartup() function and find some other way of generating the time samples.
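For illustration only, here is a minimal sketch of the LD_PRELOAD route mentioned above (my own, untested; whether it works at all depends on whether the gmon startup code in your binary resolves profil() through the PLT, rather than calling a statically linked copy):

// interpose.cpp -- build: g++ -shared -fPIC -o libprofil.so interpose.cpp -ldl
// run:            LD_PRELOAD=./libprofil.so ./a.out
#define _GNU_SOURCE 1
#include <dlfcn.h>
#include <cstddef>
#include <cstdio>

extern "C" int profil(unsigned short *buf, size_t bufsiz,
                      size_t offset, unsigned int scale)
{
    using fn = int (*)(unsigned short *, size_t, size_t, unsigned int);
    static fn real = reinterpret_cast<fn>(dlsym(RTLD_NEXT, "profil"));

    std::fprintf(stderr, "profil() intercepted: bufsiz=%zu scale=%u\n",
                 bufsiz, scale);

    // A real replacement would install its own high-resolution timer and
    // fill buf itself; this sketch only logs and forwards to glibc.
    return real ? real(buf, bufsiz, offset, scale) : -1;
}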

I have tried to modify the interval by re-arming the SIGPROF timer myself:

#include <sys/time.h>   // setitimer, itimerval

// Re-arm the ITIMER_PROF timer that the gprof runtime installed,
// replacing its period with the requested number of seconds.
void set_interval(double seconds)
{
    if (seconds <= 0)
        return;
    itimerval prev, next;
    next.it_value.tv_sec  = (time_t)seconds;
    next.it_value.tv_usec = (suseconds_t)(1000000 * (seconds - next.it_value.tv_sec));
    next.it_interval = next.it_value;         // fire periodically, not just once
    setitimer(ITIMER_PROF, &next, &prev);     // prev receives the previous setting
}

On the Linux machine I tried, calling set_interval(0.1) from main does change the interval to 1/10 of a second (though gprof then reports the wrong times in its output). But set_interval(0.001) has no effect on my machine, since the finest granularity there is 10 ms: anything below 10 ms is raised to 10 ms internally. To overcome this limitation, read 1ms resolution timer under linux recommended way.
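One conceivable way around the setitimer granularity (again a sketch of my own, untested against gprof's machinery) is a POSIX timer: timer_create() accepts nanosecond intervals and, with CLOCK_PROCESS_CPUTIME_ID, still measures CPU time and can still deliver SIGPROF:

// Build with -lrt on older glibc versions.
#include <signal.h>
#include <time.h>

timer_t make_profiling_timer(long interval_ns)   // e.g. 1000000 for 1 ms
{
    timer_t id;
    sigevent sev{};
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo  = SIGPROF;                  // the signal the profiler handles

    // CLOCK_PROCESS_CPUTIME_ID counts CPU time, like ITIMER_PROF,
    // but timer_create() is not limited to the setitimer granularity.
    timer_create(CLOCK_PROCESS_CPUTIME_ID, &sev, &id);

    itimerspec spec{};
    spec.it_value.tv_nsec    = interval_ns;      // first expiry
    spec.it_interval.tv_nsec = interval_ns;      // then periodic
    timer_settime(id, 0, &spec, nullptr);
    return id;
}

Even if the signal then arrives more often, gprof will still convert sample counts to seconds using the interval it believes is in effect, so the reported times would have to be rescaled by hand.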

This is getting so ridiculous that I strongly suggest you give up this route and either look for a different profiler, or find out why gprof does not work for you as it is.
