
Fail to create filters in OpenCV GPU (CUDA)

System information (version)

  • OpenCV => 3.2
  • Operating System / Platform => Windows 10 64 Bit
  • Compiler => Visual Studio 2015 Community
  • CUDA Toolkit Version => 8.0

Detailed description

I am using GPU-based functions and operations. I built OpenCV with CUDA support myself, and most GPU functions and operations work fine. But when it comes to filter-related functions like createGaussianFilter or createSobelFilter, the exception below is thrown:

C:\OpenCV\opencv-3.2.0\modules\cudafilters\src\filtering.cpp:414: error: (-215) rowFilter_ != 0 in function `anonymous-namespace'::SeparableLinearFilter::SeparableLinearFilter

Code to reproduce

// C++ code example
// A very simple snippet
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudaimgproc.hpp>
#include <opencv2/cudafilters.hpp>
#include <iostream>

using namespace cv;
using namespace std;

int main(int argc, char** argv)
{
    try
    {
        Ptr<cuda::Filter> filterX = cuda::createSobelFilter(CV_64F, CV_64F, 1, 0, 3, 1, BORDER_DEFAULT); // x direction
    }
    catch (cv::Exception& e)
    {
        const char* err_msg = e.what();
        std::cout << "exception caught: " << err_msg << std::endl;
    }

    return 0;
}

The code used to test the CUDA version of the Sobel filter can be found here.

In my opinion, this is a deliberate choice by the OpenCV developers (the CUDA API has supported double precision computation for a long time, I think). CV_64F, i.e. double precision floating point, is not accepted because it is less efficient, and the extra precision is not worth the performance drop. Computer graphics does not need this much precision, so GPU architectures have more single precision units (more information here, 2010).
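The assertion text `rowFilter_ != 0` suggests that the filter constructor looks up a per-type kernel and finds nothing for CV_64F. The sketch below is only a simplified model of that idea, not OpenCV's actual source: the depth codes match OpenCV's `CV_8U` to `CV_64F` values (0 to 6), but `RowFilterFn`, `makeRowFilter`, the stand-in kernels, and the choice of which slots are filled are all hypothetical.

```cpp
#include <array>
#include <stdexcept>
#include <string>

// Depth codes with the same numeric values as OpenCV's CV_8U .. CV_64F.
enum Depth { DEPTH_8U = 0, DEPTH_8S, DEPTH_16U, DEPTH_16S,
             DEPTH_32S, DEPTH_32F, DEPTH_64F };

// Hypothetical signature standing in for a GPU row-filter kernel launcher.
using RowFilterFn = const char* (*)();

// Hypothetical stand-ins for kernels that were compiled into the library.
inline const char* rowFilter8U()  { return "row filter for 8-bit unsigned"; }
inline const char* rowFilter32F() { return "row filter for 32-bit float"; }

// Returns the kernel for a depth, or throws when none exists.
inline RowFilterFn makeRowFilter(int depth)
{
    // Table indexed by depth; nullptr means "no kernel for this depth".
    // Only two slots are filled here for brevity (an assumption of this
    // sketch); the important part is that the DEPTH_64F slot is empty.
    static const std::array<RowFilterFn, 7> table = {{
        rowFilter8U, nullptr, nullptr, nullptr,
        nullptr, rowFilter32F, nullptr
    }};

    if (depth < 0 || depth >= static_cast<int>(table.size()))
        throw std::invalid_argument("unknown depth");

    RowFilterFn fn = table[static_cast<std::size_t>(depth)];
    if (fn == nullptr)  // plays the role of the "rowFilter_ != 0" check
        throw std::runtime_error("rowFilter_ != 0 failed for depth "
                                 + std::to_string(depth));
    return fn;
}
```

In this model, `makeRowFilter(DEPTH_32F)` returns a callable kernel, while `makeRowFilter(DEPTH_64F)` throws in the same spirit as the reported exception. If single precision is acceptable, requesting CV_32F instead of CV_64F in the reproduction code would be the natural thing to try.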

See also the CUDA FAQ.

Note: this is especially true of gaming GPUs compared with professional GPUs (see here, 2015):

Summary of NVIDIA GPUs

NVIDIA's GTX series are known for their great FP32 performance but are very poor in their FP64 performance. The performance generally ranges between 1:24 (Kepler) and 1:32 (Maxwell). The exceptions to this are the GTX Titan cards which blur the lines between the consumer GTX series and the professional Tesla/Quadro cards.

The Kepler architecture Quadro and Tesla series cards provide full double precision performance with 1:3 FP32. However, with the Quadro M6000, NVIDIA has decided to provide only minimal FP64 performance by giving it only 1:32 of FP32 capability and touting the M6000 as the best graphics card rather than the best graphics+compute card like the Quadro K6000.

AMD GPUs

AMD GPUs perform fairly well for FP64 compared to FP32. Most AMD cards (including consumer/gaming series) will give between 1:3 and 1:8 FP32 performance for FP64. The AMD Tahiti architectures tested in these benchmarks do not suffer from the same FP64 problems as NVIDIA's GTX series and give a 1:4 performance. Newer Hawaii architecture consumer grade GPUs are expected to provide 1:8 performance.

The FirePro W9100, W8100 and S9150 will give you an incredible FP64 1:2 FP32 performance.

Overall, AMD GPUs hold a reputation for good double precision performance ratios compared to their NVIDIA counterparts.
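To make the quoted ratios concrete, here is a minimal sketch of the arithmetic, reading "1:N" as "FP64 throughput is 1/N of FP32 throughput". The helper name `estimateFp64Gflops` is hypothetical, and any FP32 figure passed to it is an illustrative input rather than a measured value for a particular card.

```cpp
#include <stdexcept>

// Estimates FP64 throughput from FP32 throughput and a "1:N" ratio.
// fp32Gflops:  FP32 throughput in GFLOPS (illustrative input).
// fp32PerFp64: the N in "1:N", e.g. 32 for the Maxwell figure quoted above.
inline double estimateFp64Gflops(double fp32Gflops, int fp32PerFp64)
{
    if (fp32PerFp64 <= 0)
        throw std::invalid_argument("ratio denominator must be positive");
    return fp32Gflops / fp32PerFp64;
}
```

For a made-up card with 3200 GFLOPS of FP32 throughput, a 1:32 ratio leaves about 100 GFLOPS for FP64, whereas a 1:4 ratio would leave about 800 GFLOPS, an eightfold difference at the same FP32 speed.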
