
C++, OpenCV: Fastest way to read a file containing non-ASCII characters on Windows

I am writing a program using OpenCV that has to work on Windows as well as on Linux. The problem with OpenCV is that its cv::imread function cannot handle file paths that contain non-ASCII characters on Windows. A workaround is to first read the file into a buffer using another library (for example the standard library or Qt) and then decode the image from that buffer with the cv::imdecode function. This is what I currently do. However, it is much slower than using cv::imread directly. I have a TIF image that is about 1 GB in size. Reading it with cv::imread takes approximately 1 s; reading it with the buffer method takes about 14 s. I assume that imread reads only those parts of the TIF that are necessary for displaying the image (no layers etc.). Either that, or my code for reading a file into a buffer is bad.

Now my question is whether there is a better way to do this: either a better way with regard to OpenCV, or a better way of reading a file into a buffer.

I tried two different methods for the buffering, one using the standard library and one using Qt (actually, both use Qt for some things). Both are equally slow:

Method 1

std::shared_ptr<std::vector<char>> readFileIntoBuffer(QString const& path) {

#ifdef Q_OS_WIN
    std::ifstream file(path.toStdWString(), std::iostream::binary);
#else
    std::ifstream file(path.toStdString(), std::iostream::binary);
#endif
    if (!file.good()) {
        return std::shared_ptr<std::vector<char>>(new std::vector<char>());
    }
    file.exceptions(std::ifstream::badbit | std::ifstream::failbit | std::ifstream::eofbit);
    file.seekg(0, std::ios::end);
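    std::streampos length(file.tellg());
    // Check the size before allocating: tellg() yields -1 on failure, which
    // would turn into a huge value when cast to std::size_t.
    if (static_cast<std::streamoff>(length) <= 0) {
        return std::shared_ptr<std::vector<char>>(new std::vector<char>());
    }
    std::shared_ptr<std::vector<char>> buffer(new std::vector<char>(static_cast<std::size_t>(length)));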
    file.seekg(0, std::ios::beg);
    try {
        file.read(buffer->data(), static_cast<std::size_t>(length));
    } catch (...) {
        return std::shared_ptr<std::vector<char>>(new std::vector<char>());
    }
    file.close();
    return buffer;
}

And then for reading the image from the buffer:

std::shared_ptr<std::vector<char>> buffer = utility::readFileIntoBuffer(path);
cv::Mat image = cv::imdecode(*buffer, cv::IMREAD_UNCHANGED);
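The platform-specific #ifdef in Method 1 can probably be avoided with std::filesystem::path, whose native encoding is wide on Windows. The following is a minimal sketch under that assumption (C++17 or later); readFileIntoVector is a hypothetical name that does not appear in the original post, and the sketch only simplifies the reading step, it does not change how long decoding takes.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <ios>
#include <system_error>
#include <vector>

// Hypothetical helper (not from the original post): reads a whole file into
// memory and returns an empty vector on any error or for an empty file.
std::vector<std::uint8_t> readFileIntoVector(const std::filesystem::path& path) {
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec || size == 0) {
        return {};
    }

    // Since C++17 the stream constructors accept std::filesystem::path, so the
    // wide-character path is used on Windows without any #ifdef.
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        return {};
    }

    std::vector<std::uint8_t> buffer(static_cast<std::size_t>(size));
    file.read(reinterpret_cast<char*>(buffer.data()),
              static_cast<std::streamsize>(buffer.size()));
    if (file.gcount() != static_cast<std::streamsize>(buffer.size())) {
        return {};  // short read
    }
    return buffer;
}
```

With Qt, the path argument could presumably be built as std::filesystem::path(path.toStdU16String()), and the returned vector can be handed to cv::imdecode in the same way as the buffer in Method 1.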

Method 2

QByteArray readFileIntoBuffer(QString const & path) {
    QFile file(path);
    if (!file.open(QIODevice::ReadOnly)) {
        return QByteArray();
    }
    return file.readAll();
}

And for decoding the image:

QByteArray buffer = utility::readFileIntoBuffer(path);
cv::Mat matBuffer(1, buffer.size(), CV_8U, buffer.data());
cv::Mat image = cv::imdecode(matBuffer, cv::IMREAD_UNCHANGED);

UPDATE

Method 3

This method maps the file into memory using QFileDevice::map and then uses cv::imdecode.

QFile file(path);
file.open(QIODevice::ReadOnly);
unsigned char * fileContent = file.map(0, file.size(), QFileDevice::MapPrivateOption);
cv::Mat matBuffer(1, file.size(), CV_8U, fileContent);
cv::Mat image = cv::imdecode(matBuffer, cv::IMREAD_UNCHANGED);

However, this approach was no faster than the other two. I also took some time measurements and found that reading the file into memory, or mapping it into memory, is not the bottleneck. The operation that takes the majority of the time is cv::imdecode. I don't know why this is the case, since cv::imread on the same image takes only a fraction of the time.
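For reference, a measurement like the one described above can be taken by timing the read step and the decode step separately. The post does not show how the timings were obtained, so the following is only a sketch of one way to do it with std::chrono; elapsedMs is a hypothetical helper name.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not from the original post): runs the given callable
// once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double elapsedMs(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Possible use with the functions from the post (shown as comments because
// they depend on Qt and OpenCV):
//
//   QByteArray buffer;
//   cv::Mat image;
//   double readMs = elapsedMs([&] { buffer = utility::readFileIntoBuffer(path); });
//   double decodeMs = elapsedMs([&] {
//       cv::Mat matBuffer(1, buffer.size(), CV_8U, buffer.data());
//       image = cv::imdecode(matBuffer, cv::IMREAD_UNCHANGED);
//   });
```

Timing the two lambdas separately is what makes it possible to attribute the cost to cv::imdecode rather than to the file read.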

Potential Workaround

I tried obtaining an 8.3 pathname on Windows for files whose paths contain non-ASCII characters, using the following code:

QString getShortPathname(QString const & path) {
#ifndef Q_OS_WIN
    return QString();
#else
    long length = 0;
    WCHAR* buffer = nullptr;
    length = GetShortPathNameW(path.toStdWString().c_str(), nullptr, 0);
    if (length == 0) return QString();
    buffer = new WCHAR[length];
    length = GetShortPathNameW(path.toStdWString().c_str(), buffer, length);
    if (length == 0) {
        delete[] buffer;
        return QString();
    }
    QString result = QString::fromWCharArray(buffer);
    delete[] buffer;
    return result;
#endif
}

However, I found out that 8.3 pathname generation is disabled on my machine, so it may be disabled on others as well. I therefore have not been able to test this yet, and it does not seem to provide a reliable workaround. Another problem is that the function does not tell me that 8.3 pathname generation is disabled.
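Since the function gives no explicit signal, one possible way to judge whether the returned path is usable is to check whether it consists only of ASCII characters, which is what cv::imread needs here. This is a sketch, not a tested solution: isPureAscii is a hypothetical helper, and it rests on the assumption that a path which still contains non-ASCII characters after the call was not shortened.

```cpp
#include <string>

// Hypothetical helper (not from the original post): returns true if every
// character of the path is in the 7-bit ASCII range.
bool isPureAscii(const std::wstring& path) {
    for (const wchar_t c : path) {
        // Cast first so the comparison is also correct where wchar_t is signed.
        if (static_cast<unsigned long>(c) > 0x7FUL) {
            return false;
        }
    }
    return true;
}
```

A caller could pass the result of getShortPathname to cv::imread only when isPureAscii(result.toStdWString()) is true, and fall back to the buffer-based methods otherwise.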

There is an open ticket on this in the OpenCV GitHub repository: https://github.com/opencv/opencv/issues/4292

One of the comments there suggests a workaround that avoids reading the whole file into memory by using a memory-mapped file (with help from Boost):

mapped_file map(path(L"filename"), ios::in);
Mat file(1, numeric_cast<int>(map.size()), CV_8S, const_cast<char*>(map.const_data()), CV_AUTOSTEP);
Mat image(imdecode(file, 1));
