
How to re-initiate read after reaching EOF during stream decompression with Boost.Iostreams?

I am trying to implement a streaming decompressor with Boost.Iostreams that can work with incomplete compressed files (the size of the uncompressed file is known before decompression starts). Basically, I run the compressor and decompressor simultaneously, and since the compressor is slower than the decompressor, the decompressor reaches the end of the file. I am trying to reset the stream to re-initiate the read operation, but I could not get it to work: gcount() still returns 0 after clear() and seekg(0). My ultimate goal is a mechanism that continues from the point where the end of file was reached, instead of returning to the beginning. But I cannot even return to the beginning of the file.

I would appreciate any kind of support. Thank you in advance.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>

#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/filtering_stream.hpp>

const std::size_t bufferSize = 1024;
const std::size_t testDataSize = 13019119616; 

int main() {

    // Decompress
    std::ofstream outStream("image_boost_decompressed.img", std::ios_base::out);
    std::ifstream inStream("image_boost_compressed.img.gz", std::ios_base::in | std::ios_base::binary);
    
    boost::iostreams::filtering_istream out;
    out.push(boost::iostreams::gzip_decompressor());
    out.push(inStream);

    char buf[bufferSize] = {};

    std::cout << "Decompression started!" << std::endl;

    std::size_t loopCount = 0;
    std::size_t decompressedDataSize = 0;

    while(decompressedDataSize < testDataSize) {
        std::cout << "cursor bef: " << inStream.tellg() << std::endl; 

        out.read(buf, bufferSize);

        std::cout << "read size: " << out.gcount() << std::endl;
        std::cout << "cursor after: " << inStream.tellg() << std::endl; 

        if (out.gcount() > 0) {
            outStream.write(buf, out.gcount());
            decompressedDataSize = decompressedDataSize + out.gcount();
        } else if (out.gcount() == 0) {
            std::cout << "clear initiated!" << std::endl;
            inStream.clear();
            inStream.seekg(0);
        }
        std::cout << "----------------" << std::endl;
    }

    std::cout << "Decompression ended!" << std::endl;
    std::cout << "decompressed data size: " << decompressedDataSize << std::endl;
    outStream.close();

    return 0;
}


"Basically, I run the compressor and decompressor simultaneously, and since the compressor is slower than the decompressor, the decompressor reaches the end of the file."

In your code you're NOT running a compressor. It's not the slowness of the compressor that causes your program to see EOF. Instead, your EOF is caused by the fact that you actually reach the end of the file.

This means you have a race condition where you access the file too early.

  1. If your aim is to use the "file" only as a temporary station during fully streaming operations, the usual way to approach this is to use a (named) pipe (a FIFO on POSIX platforms) instead of a file.

  2. If you cannot do that, the simplest fix is to make sure you start processing files only when they are complete. The usual way to accomplish this is by doing "transactional" uploads (meaning you upload into a temporary location, and then rename the file into place only after completion).

Either way, your program will correctly see EOF only when the writing side closes its end.
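The "transactional" approach from point 2 can be sketched roughly as follows. This is only a minimal illustration, assuming C++17 and that the temporary file lives on the same filesystem as the target; the helper name publish_atomically and the ".part" suffix are made up for this example.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from the original post): write `data` to
// "<final_path>.part" and rename it into place only once it is complete,
// so a reader never finds a half-written file under `final_path`.
inline bool publish_atomically(const fs::path& final_path, const std::string& data) {
    assert(!final_path.empty());
    fs::path tmp = final_path;
    tmp += ".part";  // assumed naming scheme for the temporary file
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        out.flush();
        if (!out) return false;
    }  // the stream is closed here, before the rename
    std::error_code ec;
    fs::rename(tmp, final_path, ec);  // non-throwing overload
    return !ec;
}
```

The reading side then only ever opens the final name, so whatever it opens is already complete. In a real compressor the data would be streamed into the temporary file rather than passed as one string.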

I have some examples of similar approaches on this site, e.g. this networked example that pipes to zcat to do the streaming decompression.

If you want to pick up where you left off, then use seekg(0, std::ios_base::cur). It works:

#include <iostream>
#include <fstream>

int main() {
    std::ofstream out("test.out");
    out << "line 1\n";
    out.flush();
    std::ifstream in("test.out");
    char line[256];
    in.read(line, sizeof(line) - 1);  // leave room for the terminator
    line[in.gcount()] = 0;
    std::cout << line;
    if (in.eof())
        std::cout << "-- at eof\n";
    out << "line 2\n";
    out.flush();
    in.clear();
    if (in.good())
        std::cout << "-- now good!\n";
    in.seekg(0, std::ios_base::cur);
    in.read(line, sizeof(line) - 1);  // leave room for the terminator
    line[in.gcount()] = 0;
    std::cout << line;
    in.close();
    out.close();
}
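The clear-and-seek step from the example above can also be wrapped in a polling loop that keeps reading a growing file until a known number of bytes has arrived. The following is only a sketch: the helper name read_following, the 10 ms sleep, and the idle-poll limit are assumptions made for illustration, not part of the original answer.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <thread>

// Hypothetical helper: collect up to `expected` bytes from a file that may
// still be growing. Gives up after `max_idle_polls` consecutive empty reads.
inline std::string read_following(std::ifstream& in, std::size_t expected,
                                  int max_idle_polls = 50) {
    assert(in.is_open());
    std::string result;
    char buf[4096];
    int idle = 0;
    while (result.size() < expected && idle < max_idle_polls) {
        const std::size_t want = std::min(sizeof buf, expected - result.size());
        in.read(buf, static_cast<std::streamsize>(want));
        const std::streamsize got = in.gcount();
        if (got > 0) {
            result.append(buf, static_cast<std::size_t>(got));
            idle = 0;
        }
        if (!in) {
            // EOF (or a short read): reset the state and stay at the
            // current offset so the next read picks up newly written data.
            in.clear();
            in.seekg(0, std::ios_base::cur);
            if (got == 0) {
                ++idle;
                std::this_thread::sleep_for(std::chrono::milliseconds(10));
            }
        }
    }
    return result;
}
```

Note that this only follows the raw bytes of the file. Polling is a workaround; a pipe avoids the guesswork about when the writer is finished.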

As for the decompressor, you don't want to let it see an end-of-input indicator. Run the decompressor separately, and feed it only what you have read so far.
