
How to measure fft and ifft time from the cuda::convolution function?

I am using cuda::convolution::convolve to compute a Gaussian convolution, and I want to measure the time spent in the FFT and the inverse FFT (IFFT), but I don't know how to do that.

I found the source code on GitHub, but I have no idea how to measure the time from it. This is the relevant part:

 cufftSafeCall( cufftExecR2C(planR2C, templ_block.ptr<cufftReal>(), templ_spect.ptr<cufftComplex>()) );

        // Process all blocks of the result matrix
        for (int y = 0; y < result.rows; y += block_size.height)
        {
            for (int x = 0; x < result.cols; x += block_size.width)
            {
                Size image_roi_size(std::min(x + dft_size.width, image.cols) - x,
                                    std::min(y + dft_size.height, image.rows) - y);
                GpuMat image_roi(image_roi_size, CV_32F, (void*)(image.ptr<float>(y) + x),
                                 image.step);
                cuda::copyMakeBorder(image_roi, image_block, 0, image_block.rows - image_roi.rows,
                                    0, image_block.cols - image_roi.cols, 0, Scalar(), _stream);

                cufftSafeCall(cufftExecR2C(planR2C, image_block.ptr<cufftReal>(),
                                           image_spect.ptr<cufftComplex>()));
                cuda::mulAndScaleSpectrums(image_spect, templ_spect, result_spect, 0,
                                          1.f / dft_size.area(), ccorr, _stream);
                cufftSafeCall(cufftExecC2R(planC2R, result_spect.ptr<cufftComplex>(),
                                           result_data.ptr<cufftReal>()));

                Size result_roi_size(std::min(x + block_size.width, result.cols) - x,
                                     std::min(y + block_size.height, result.rows) - y);
                GpuMat result_roi(result_roi_size, result.type(),
                                  (void*)(result.ptr<float>(y) + x), result.step);
                GpuMat result_block(result_roi_size, result_data.type(),
                                    result_data.ptr(), result_data.step);

                result_block.copyTo(result_roi, _stream);
            }
        }

        cufftSafeCall( cufftDestroy(planR2C) );
        cufftSafeCall( cufftDestroy(planC2R) );

        syncOutput(result, _result, _stream);
    }
}

I once had to measure something similar and did it like this:

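#include <chrono>
#include <cuda_runtime.h>  // cudaDeviceSynchronize

// cuFFT work is launched asynchronously with respect to the host,
// so let any pending GPU work finish before starting the clock.
cudaDeviceSynchronize();
auto begin = std::chrono::high_resolution_clock::now();

cufftSafeCall(cufftExecR2C(planR2C, image_block.ptr<cufftReal>(),
                           image_spect.ptr<cufftComplex>()));
// or the call you want to measure

// Wait for the transform to complete; without this, the host timer
// may capture little more than the launch overhead.
cudaDeviceSynchronize();
auto elapsed = std::chrono::high_resolution_clock::now() - begin;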

Then you can convert the result, for example to microseconds, using: time = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
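As a minimal sketch of that conversion step, the two helpers below turn any std::chrono duration into whole microseconds or fractional milliseconds. The names to_microseconds and to_milliseconds are hypothetical, chosen only for illustration; they are not part of OpenCV or cuFFT.

```cpp
#include <chrono>
#include <cstdint>
#include <ratio>

// Hypothetical helper: elapsed time as whole microseconds.
// duration_cast truncates toward zero, so sub-microsecond parts are dropped.
template <class Rep, class Period>
std::int64_t to_microseconds(std::chrono::duration<Rep, Period> elapsed) {
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}

// Hypothetical helper: elapsed time as fractional milliseconds.
// A floating-point duration keeps the fractional part, so no cast is needed.
template <class Rep, class Period>
double to_milliseconds(std::chrono::duration<Rep, Period> elapsed) {
    return std::chrono::duration<double, std::milli>(elapsed).count();
}
```

With the elapsed value from the snippet above, to_microseconds(elapsed) gives the same number as the duration_cast expression, while to_milliseconds(elapsed) is handy when a single FFT takes less than a millisecond and the fraction matters.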

If the call is inside a for loop and you want the time of every call, you can declare an array (or a std::vector) and store the elapsed time of each iteration in it.
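The per-iteration idea can be sketched as follows. The function time_each_round and its parameters are hypothetical names introduced for illustration; the callable stands in for whatever you want to measure, which for GPU work is assumed to include its own synchronization so that the host clock sees the finished transform. std::chrono::steady_clock is used here instead of high_resolution_clock because it is guaranteed to be monotonic; that is a design choice of this sketch, not something the original code does.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: call `work(i)` once per round and record how long
// each call took, in microseconds, in the order the rounds were executed.
template <class Work>
std::vector<std::int64_t> time_each_round(std::size_t rounds, Work work) {
    using clock = std::chrono::steady_clock;  // monotonic clock

    std::vector<std::int64_t> times_us;
    times_us.reserve(rounds);

    for (std::size_t i = 0; i < rounds; ++i) {
        const auto begin = clock::now();
        work(i);  // assumption: blocks until the measured work is complete
        const auto elapsed = clock::now() - begin;
        times_us.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count());
    }
    return times_us;
}
```

In the convolution loop, the body of one block iteration (or just the cufftExecR2C / cufftExecC2R call followed by a synchronization) would play the role of work; the returned vector can then be summed or averaged after the loop.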
