

Why does the first call to a function take much longer than the second, third, and later calls?

Here is my code, based on OpenCV:

#include <ctime>
#include <iostream>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

int main()
{
    clock_t start, stop;
    Mat img = imread("lena.jpg", IMREAD_GRAYSCALE);
    img.convertTo(img, CV_32F, 1.0);
    float *imgInP = (float *)img.data;    // pointer to the input data
    Mat imgOut = Mat::zeros(img.size(), CV_32F);   // output mat, same size as the input
    float *imgOutP = (float *)imgOut.data;  // pointer to the output data

    // time several consecutive calls to OpenCV's boxFilter
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 1 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 2 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 3 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;

    return 0;
}

Here is the output of the program above:

BoxFilter on OpenCV 1 : 72.368 ms

BoxFilter on OpenCV 2 : 0.495 ms

BoxFilter on OpenCV 3 : 0.403 ms

Why does the first call to boxFilter (72.368 ms) take so much longer than the second (0.495 ms) and the third (0.403 ms)?

What's more, if I change the input image before the third call to boxFilter, the timings in the output do not change either. So the image data cache may not be the cause...

Thanks for any advice.

My system is Ubuntu 14.04, i5-4460, 12 GB RAM, OpenCV version: 3.1, cmake version: 3.2, g++ version: 4.8.4

Below is my CMake file:

cmake_minimum_required(VERSION 3.7)


find_package(OpenCV REQUIRED)

set(SOURCE_FILES main.cpp)
add_executable(boxfilterTest ${SOURCE_FILES})

target_link_libraries(boxfilterTest ${OpenCV_LIBS})

The IDE is CLion.

The difference in timing is due to both the instruction cache and the data cache. The data cache effect can be verified by forcing the matrix to be re-allocated at a different size (e.g. by resizing the image). If the image is resized between calls to boxFilter, the execution times of the boxFilter calls become very close to each other. Here is example code demonstrating this.

#include <iostream>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

int main()
{
    clock_t start, stop;
    Mat img = imread("lena.jpg", IMREAD_GRAYSCALE);
    img.convertTo(img, CV_32F, 1.0);
    float *imgInP = (float *)img.data;    // pointer to the input data
    Mat imgOut = Mat::zeros(img.size(), CV_32F);   // output mat, same size as the input
    float *imgOutP = (float *)imgOut.data;  // pointer to the output data

    // time several consecutive calls to OpenCV's boxFilter
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();

    cv::resize(img, img, cv::Size(), 1.1, 1.1); // force data re-allocation

    cout << "BoxFilter on OpenCV 1 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    //GaussianBlur(img, imgOut, Size(31, 31), 0.5);
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();

    cv::resize(img, img, cv::Size(), 0.909, 0.909);  // force data re-allocation

    cout << "BoxFilter on OpenCV 2 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 3 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;

    return 0;
}

Program output:

Without data re-allocation:

BoxFilter on OpenCV 1 : 2.459 ms

BoxFilter on OpenCV 2 : 1.599 ms

BoxFilter on OpenCV 3 : 1.568 ms

With data re-allocation:

BoxFilter on OpenCV 1 : 2.225 ms

BoxFilter on OpenCV 2 : 2.368 ms

BoxFilter on OpenCV 3 : 2.091 ms

Well, I think it may be caused by the instruction cache (after all, there is * MB of L2 cache in the CPU). But I cannot figure out how to verify that or how to improve it.

