
Problem measuring N times the execution time of a code block

EDIT: I found my problem just after writing this long post explaining every little detail. If someone can give me a good answer on what I'm doing wrong and how I can get the execution time in seconds (as a float with 5 decimal places or so), I'll mark it as accepted. Hint: the problem was in how I interpreted the clock_gettime() man page.

Hi,

Let's say I have a function named myOperation whose execution time I need to measure. To measure it, I'm using clock_gettime(), as was recommended here in one of the comments.

My teacher recommends that we measure it N times so we can include an average, standard deviation and median in the final report. He also recommends that we execute myOperation M times per measurement instead of just once. If myOperation is a very fast operation, running it M times gives us a sense of the "real time" it takes, because the clock being used might not have the precision required to measure a single run. So whether to execute myOperation once or M times really depends on whether the operation itself takes long enough for the precision of the clock we are using.

I'm having trouble dealing with that M-times execution. Increasing M decreases the final average value by a lot, which doesn't make sense to me. Think of it like this: on average you take 3 to 5 seconds to travel from point A to B. Then you go from A to B and back to A 5 times (which makes 10 trips, because A to B is the same as B to A) and you measure that. When you divide by 10, the average you get should be the same as the average for a single trip from A to B, which is 3 to 5 seconds.

This is what I want my code to do, but it's not working. If I keep increasing the number of times I go from A to B and back to A, the average gets lower and lower each time, which makes no sense to me.

Enough theory, here's my code:

#include <stdio.h>
#include <time.h>

#define MEASUREMENTS 1
#define OPERATIONS   1

typedef struct timespec TimeClock;

TimeClock diffTimeClock(TimeClock start, TimeClock end) {
    TimeClock aux;

    if((end.tv_nsec - start.tv_nsec) < 0) {
        aux.tv_sec = end.tv_sec - start.tv_sec - 1;
        aux.tv_nsec = 1E9 + end.tv_nsec - start.tv_nsec;
    } else {
        aux.tv_sec = end.tv_sec - start.tv_sec;
        aux.tv_nsec = end.tv_nsec - start.tv_nsec;
    }

    return aux;
}

int main(void) {
    TimeClock sTime, eTime, dTime;
    int i, j;

    for(i = 0; i < MEASUREMENTS; i++) {
        printf(" » MEASURE %02d\n", i+1);

        clock_gettime(CLOCK_REALTIME, &sTime);

        for(j = 0; j < OPERATIONS; j++) {
            myOperation();
        }

        clock_gettime(CLOCK_REALTIME, &eTime);

        dTime = diffTimeClock(sTime, eTime);

        printf("   - NSEC (TOTAL): %ld\n", dTime.tv_nsec);
        printf("   - NSEC (OP): %ld\n\n", dTime.tv_nsec / OPERATIONS);
    }

    return 0;
}

Notes: The diffTimeClock function above is from this blog post. I replaced my real operation with myOperation() because posting my real functions would mean posting long blocks of code. If you want to compile the code, you can easily write a myOperation() that does whatever you like.

As you can see, OPERATIONS = 1 and the results are:

 » MEASURE 01
   - NSEC (TOTAL): 27456580
   - NSEC (OP): 27456580

For OPERATIONS = 100 the results are:

 » MEASURE 01
   - NSEC (TOTAL): 218929736
   - NSEC (OP): 2189297

For OPERATIONS = 1000 the results are:

 » MEASURE 01
   - NSEC (TOTAL): 862834890
   - NSEC (OP): 862834

For OPERATIONS = 10000 the results are:

 » MEASURE 01
   - NSEC (TOTAL): 574133641
   - NSEC (OP): 57413

Now, I'm not a math whiz, far from it actually, but this doesn't make any sense to me whatsoever. I've already talked about this with a friend who is on this project with me, and he can't understand the differences either. I don't understand why the value gets lower and lower as I increase OPERATIONS. The operation itself should take the same time (on average of course, not exactly the same time), no matter how many times I execute it.

You could tell me that it actually depends on the operation itself, on the data being read, that some data could already be in the cache and so on, but I don't think that's the problem. In my case, myOperation reads 5000 lines of text from a CSV file, splits the values on ; and inserts them into a data structure. For each iteration, I destroy the data structure and initialize it again.

Now that I think of it, I also think there's a problem with how I measure time with clock_gettime(); maybe I'm not using it right. I mean, look at the last example, where OPERATIONS = 10000. The total time reported was 574133641 ns, which would be roughly 0.5 s. That's impossible: it took a couple of minutes, as I couldn't stand staring at the screen waiting and went to eat something.

Looks like the TimeClock type has two fields, one for seconds (tv_sec) and one for nanoseconds (tv_nsec). The nanoseconds field only holds the fraction of a second left over after the whole seconds, so it never reaches 1E9 and wraps around whenever the run takes longer than a second. It doesn't make sense to divide just the nanoseconds field by the number of operations. You need to divide the total time.
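A minimal sketch of that idea in C, assuming the difference has already been computed into a struct timespec as in the question. The function names totalNanoseconds and nanosecondsPerOperation are hypothetical, chosen only for illustration:

```c
#include <time.h>

/* Combine both fields of a time difference into one nanosecond count.
 * long long is used because tv_sec * 1E9 can overflow a 32-bit long. */
long long totalNanoseconds(struct timespec diff) {
    return (long long)diff.tv_sec * 1000000000LL + diff.tv_nsec;
}

/* Average nanoseconds per operation; operations must be greater than 0. */
long long nanosecondsPerOperation(struct timespec diff, long long operations) {
    return totalNanoseconds(diff) / operations;
}
```

In the question's main(), the two printf calls could then print totalNanoseconds(dTime) and nanosecondsPerOperation(dTime, OPERATIONS) with the %lld format instead of using dTime.tv_nsec directly.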

If you are on a POSIX system that provides the gettimeofday() function, you can use something like this to get the current time in microseconds:

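#include <stddef.h>    /* NULL */
#include <sys/time.h>  /* gettimeofday(), struct timeval */

/* Current wall-clock time as a single microsecond count. */
long long timeInMicroseconds(void) {
    struct timeval tv;

    gettimeofday(&tv,NULL);
    return (((long long)tv.tv_sec)*1000000)+tv.tv_usec;
}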

The reason this is very handy is that, to compute how long your function took, you only need to do this:

long long start = timeInMicroseconds();
... do your task N times ...
printf("Total microseconds: %lld", timeInMicroseconds()-start);

So you don't have to deal with two integers, one holding seconds and one holding microseconds. Adding and subtracting times works in the obvious way.
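As a rough sketch, the snippet above could be wrapped into a helper that runs the task a given number of times and returns the average per call. The names averageMicroseconds and placeholderOperation are hypothetical, and placeholderOperation is only a stand-in for whatever is really being measured:

```c
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time as a single microsecond count. */
long long timeInMicroseconds(void) {
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return ((long long)tv.tv_sec) * 1000000 + tv.tv_usec;
}

/* Hypothetical stand-in for the operation being measured. */
void placeholderOperation(void) {
    volatile unsigned long sink = 0;  /* volatile keeps the loop from being optimized away */

    for (unsigned long i = 0; i < 100000; i++) {
        sink += i;
    }
}

/* Runs op() `times` times (times > 0) and returns the average
 * number of microseconds per call. */
double averageMicroseconds(void (*op)(void), int times) {
    long long start = timeInMicroseconds();

    for (int i = 0; i < times; i++) {
        op();
    }

    long long total = timeInMicroseconds() - start;
    return (double)total / times;
}
```

Calling averageMicroseconds(placeholderOperation, 100) divides the whole elapsed time by 100, so nothing is lost when the run lasts longer than a second.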

You just need to change your diffTimeClock() function to return the difference in seconds, as a double:

double diffTimeClock(TimeClock start, TimeClock end) {
    double diff;

    diff = (end.tv_nsec - start.tv_nsec) / 1E9;
    diff += (end.tv_sec - start.tv_sec);

    return diff;
}

and in the main routine change dTime to a double, and the printfs to suit:

printf("   - SEC (TOTAL): %f\n", dTime);
printf("   - SEC (OP): %f\n\n", dTime / OPERATIONS);
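One detail worth spelling out: the double version needs no borrow branch, because a negative nanosecond difference is simply offset by the seconds term. The sketch below repeats the same arithmetic under the hypothetical names diffSeconds and secondsPerOperation so that the case can be checked in isolation:

```c
#include <time.h>

/* Elapsed seconds between two timestamps, same arithmetic as above. */
double diffSeconds(struct timespec start, struct timespec end) {
    /* This term is negative when end.tv_nsec < start.tv_nsec ... */
    double diff = (end.tv_nsec - start.tv_nsec) / 1E9;

    /* ... and the whole-second difference compensates for it. */
    diff += (double)(end.tv_sec - start.tv_sec);

    return diff;
}

/* Average seconds per operation; operations must be greater than 0. */
double secondsPerOperation(struct timespec start, struct timespec end,
                           int operations) {
    return diffSeconds(start, end) / operations;
}
```

For a start of 1 s + 900000000 ns and an end of 3 s + 100000000 ns, the nanosecond term is -0.8 and the seconds term is 2, giving about 1.2 seconds. Printing with %.5f instead of %f gives the five decimal places the question asks for.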

I generally use the time() function for this. It shows wall clock time, but that's really what I care about in the end. Keep in mind that it only has one-second resolution, so it suits runs that take several seconds or more.

One gotcha with performance testing is that the operating system may cache file-system-related operations, so the second (and later) runs can be much faster than the first. You generally need to test many operations and average the results to get a good feel for the effect of any change you make. There are so many variables that averaging helps you filter out the noise.
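For the averaging step, here is a small sketch that summarizes N measurements with the mean and median the question mentions, plus the sample variance. It assumes the per-operation times have already been collected into an array of doubles, and the function names are hypothetical. The standard deviation is the square root of the variance (sqrt() from <math.h>, which usually means linking with -lm):

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort(): orders doubles ascending. */
int compareDoubles(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Arithmetic mean of n samples (n > 0). */
double sampleMean(const double *samples, size_t n) {
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return sum / (double)n;
}

/* Sample variance with an n - 1 denominator (n > 1). */
double sampleVariance(const double *samples, size_t n) {
    double mean = sampleMean(samples, n);
    double acc = 0.0;

    for (size_t i = 0; i < n; i++) {
        double d = samples[i] - mean;
        acc += d * d;
    }
    return acc / (double)(n - 1);
}

/* Median of n samples (n > 0). Note: sorts the array in place. */
double sampleMedian(double *samples, size_t n) {
    qsort(samples, n, sizeof samples[0], compareDoubles);

    if (n % 2 == 1) {
        return samples[n / 2];
    }
    return (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
}
```

The median is less sensitive than the mean to a slow first run caused by a cold file cache, which is one reason to report both.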
