
How to compress multiple buffers in memory into one with Boost and get the compressed size?

I want to compress multiple buffers (in my case video frames from different sources) into one new buffer via boost's zlib compression and then, later, write everything into a file on the disk. I need these two steps because I want to prepend a header in the file which contains the final size of the compressed buffers (this will later function as offset for the parser). I want to achieve this with boost's iostreams library.

The following related questions arose:

a) Do I need to use filtering_stream or filtering_streambuf? I would expect the latter to already have some kind of buffering behavior.

b) How can I close the filtering_stream(buf) and write it to a buffer?

c) How can I read the final size of the compressed data? .tellg() is not implemented for these filtering_streams (as mentioned somewhere else on SO).

d) Can you have multiple sources, i.e. my three buffers, or do I need to combine them? (see below for my approach).

class Frame {
private:
    /* other things */
public:
    float buf1[3];
    float buf2[3];
    float buf3[4];
    /* more things */
};

int main() {
    Frame frame;
    
    namespace bio = boost::iostreams;
    
    bio::filtering_streambuf<bio::input> in;
    in.push(bio::gzip_compressor());
    /* Could you also add the buffers individually? */
    in.push(bio::array_source(reinterpret_cast<const char*>(frame.buf1), 3 + 7 + 12 + (sizeof(float) * 3)));
    
    const char *compressed = /* How to close in and write the contents to this buffer? */
    int compressedSize = /* How to get this from in? in.tellg() does not work */
    
    std::stringstream headerInformation;
    headerInformation << "START";
    headerInformation << "END " << compressedSize;
    
    std::ofstream ofs("ouput.data", std::ofstream::out | std::ofstream::binary | std::ofstream::app);
    bio::filtering_ostream out;
    out.push(ofs);
    out.write(headerInformation.str().data(), headerInformation.str().length());
    out.write(compressed, compressedSize);
    
    boost::iostreams::close(out);
    boost::iostreams::close(in);
    
    return 0;
}

a) Do I need to use filtering_stream or filtering_streambuf? I would expect the latter to already have some kind of buffering behavior.

Both would work. The stream front-end adds formatted I/O and locale support on top of the streambuf, just like std::ostream does over std::streambuf in the standard library.
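A minimal sketch of the difference (the text_out/raw_out strings are hypothetical sinks, not part of the original code): the stream front-end supports formatted insertion, while the streambuf only exposes the raw std::streambuf interface.

std::string text_out, raw_out;

// filtering_ostream: behaves like a std::ostream, so formatted
// (text/locale-aware) output via operator<< is available
bio::filtering_ostream os;
os.push(bio::gzip_compressor());
os.push(bio::back_inserter(text_out));
os << "frame " << 42 << '\n';

// filtering_ostreambuf: only the raw std::streambuf interface
bio::filtering_ostreambuf sb;
sb.push(bio::gzip_compressor());
sb.push(bio::back_inserter(raw_out));
sb.sputn("raw bytes", 9);

// both flush their gzip trailer when they go out of scope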

b) How can I close the filtering_stream(buf) and write it to a buffer?

You could use an array_sink, a back_insert_device, a memory-mapped file, etc. See https://www.boost.org/doc/libs/1_72_0/libs/iostreams/doc/ ("Models").
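For example, a sketch using bio::back_inserter (from boost/iostreams/device/back_inserter.hpp), which wraps a container in a back_insert_device; data and size stand in for one of your buffers:

std::string compressed;                          // grows as needed

{
    bio::filtering_ostream filter;
    filter.push(bio::gzip_compressor());
    filter.push(bio::back_inserter(compressed)); // back_insert_device<std::string>
    filter.write(data, size);
}   // leaving the scope flushes the gzip trailer

// compressed.data() and compressed.size() now hold the compressed bytes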

c) How can I read the final size of the compressed data? .tellg() is not implemented for these filtering_streams (as mentioned somewhere else on SO).

Detect it from your underlying output device/stream. Don't forget to flush/close the filtering layer before you do.
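For instance, if the underlying device is a file stream, a sketch (data/size are placeholders) of measuring the compressed size from the file position; you would then have to seek back and patch a fixed-width header, or simply compress to memory first as shown further down:

std::ofstream ofs("output.data", std::ios::binary);

auto const before = ofs.tellp();
{
    bio::filtering_ostream filter;
    filter.push(bio::gzip_compressor());
    filter.push(ofs);              // compress straight into the file
    filter.write(data, size);
}   // scope exit pops the chain and flushes the gzip trailer into ofs

auto const compressedSize = static_cast<std::size_t>(ofs.tellp() - before);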

d) Can you have multiple sources, i.e. my three buffers, or do I need to combine them? (see below for my approach).

Yes. You can write each buffer to the same filter chain with a separate write, or combine them into one contiguous write; either way they end up in a single compressed stream. See the sketch right below.
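A minimal sketch of writing the three buffers individually into one filter chain:

std::vector<char> compressed_buffer;

{
    bio::filtering_ostream filter;
    filter.push(bio::gzip_compressor());
    filter.push(bio::back_inserter(compressed_buffer));

    // one write per source buffer; the compressor sees one continuous stream
    filter.write(reinterpret_cast<char const*>(frame.buf1), sizeof frame.buf1);
    filter.write(reinterpret_cast<char const*>(frame.buf2), sizeof frame.buf2);
    filter.write(reinterpret_cast<char const*>(frame.buf3), sizeof frame.buf3);
}   // leaving the scope flushes the gzip trailer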

Show Me The Code...

I would reverse the direction and make the filter compress on write into an output buffer:

using RawBuffer = std::vector<char>;
using Device = bio::back_insert_device<RawBuffer>;

RawBuffer compressed_buffer; // optionally reserve some size

{
    bio::filtering_ostream filter;
    filter.push(bio::gzip_compressor());
    filter.push(Device{ compressed_buffer });

    filter.write(reinterpret_cast<char const*>(&frame.buf1),
                 sizeof(frame) - offsetof(Frame, buf1));
}

To use a filtering streambuf instead:

{
    bio::filtering_ostreambuf filter;
    filter.push(bio::gzip_compressor());
    filter.push(Device{ compressed_buffer });

    std::copy_n(reinterpret_cast<char const*>(&frame.buf1),
                sizeof(frame) - offsetof(Frame, buf1),
                std::ostreambuf_iterator<char>(&filter));
}

Now the answers to your questions fall out naturally:

const char *compressed = compressed_buffer.data();
int compressedSize = compressed_buffer.size();

I would reduce the remaining code to:

{
    std::ofstream ofs("ouput.data", std::ios::binary | std::ios::app);
    ofs << "START";
    ofs << "END " << compressed_buffer.size();
    ofs.write(compressed_buffer.data(), compressed_buffer.size());
}

Consider not reopening the output stream for each frame. :)

Live Demo

Live On Coliru

#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <type_traits>
#include <vector>
namespace bio = boost::iostreams;

class Frame {
private:
    /* other things */
public:
    float buf1[3];
    float buf2[3];
    float buf3[4];
    /* more things */
};

int main() {
    Frame const frames[]{
        {
            { 1, 2, 3 },
            { 4, 5, 6 },
            { 7, 8, 9, 10 },
        },
        {
            { 11, 12, 13 },
            { 14, 15, 16 },
            { 17, 18, 19, 20 },
        },
        {
            { 21, 22, 23 },
            { 24, 25, 26 },
            { 27, 28, 29, 30 },
        },
    };

    // avoiding UB:
    static_assert(std::is_trivial_v<Frame> &&
                  std::is_standard_layout_v<Frame>);

    using RawBuffer = std::vector<char>;
    using Device = bio::back_insert_device<RawBuffer>;

    std::remove("output.data");
    std::ofstream ofs("output.data", std::ios::binary | std::ios::app);

    RawBuffer compressed_buffer; // optionally reserve some size

    for (Frame const& frame : frames) {
        compressed_buffer.clear(); // keep the allocation (no shrink_to_fit) so it can be reused

        {
            bio::filtering_ostreambuf filter;
            filter.push(bio::gzip_compressor());
            filter.push(Device{ compressed_buffer });

            std::copy_n(reinterpret_cast<char const*>(&frame.buf1),
                        sizeof(frame) - offsetof(Frame, buf1),
                        std::ostreambuf_iterator<char>(&filter));
        }

        ofs << "START";
        ofs << "END " << compressed_buffer.size();
        ofs.write(compressed_buffer.data(), compressed_buffer.size());
    }
}

Deterministically generates output.data:

00000000: 5354 4152 5445 4e44 2035 301f 8b08 0000  STARTEND 50.....
00000010: 0000 0000 ff63 6068 b067 6060 7000 2220  .....c`h.g``p." 
00000020: 6e00 e205 407c 0088 1f00 3183 2303 8300  n...@|....1.#...
00000030: 102b 3802 0058 a049 af28 0000 0053 5441  .+8..X.I.(...STA
00000040: 5254 454e 4420 3438 1f8b 0800 0000 0000  RTEND 48........
00000050: 00ff 6360 3070 6460 7000 e200 204e 00e2  ..c`0pd`p... N..
00000060: 0220 6e00 e20e 209e 00c4 3380 7881 2300  . n... ...3.x.#.
00000070: 763b 7371 2800 0000 5354 4152 5445 4e44  v;sq(...STARTEND
00000080: 2034 391f 8b08 0000 0000 0000 ff63 6058   49..........c`X
00000090: e1c8 c0b0 0188 7700 f101 203e 01c4 1780  ......w... >....
000000a0: f806 103f 00e2 1740 fcc1 1100 dfb4 6cde  ...?...@......l.
000000b0: 2800 0000                                (...
