
Efficient Binary Serialization Of Mostly Basic Types

I'm trying to figure out the best approach for transferring some data over the network. Here is what I'm hoping to achieve:

The application runs and computes some data:

int w = 5;
float x = 4.736;
std::string y = "Some String.";
std::vector<int> z;
z.push_back(1);
z.push_back(2);
z.push_back(3);

Then we put it in a binary container:

BinaryContainer Data;
Data.Write(w);
Data.Write(x);
Data.Write(y);
Data.Write(z);

We then transfer it over the network:

SendData(Data.c_str());

And read it out on the other side:

BinaryContainer ReceivedData(IncomingData);
int w = ReceivedData.Read();
float x = ReceivedData.Read();
std::string y = ReceivedData.Read();
std::vector<int> z = ReceivedData.Read();

The example above outlines, from a high-level perspective, how the basic functionality should work. I've looked at many different serialization libraries and none seems to fit quite right. I'm leaning towards learning how to write the functionality myself.
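The `Read()` calls in the example carry no type information, so as written the compiler cannot tell an `int` read from a `float` read. One way to get close to that syntax is a proxy object with a templated conversion operator, so the type is taken from the variable being initialized. The following is only a minimal sketch under stated assumptions: `Reader`, `ReadAs`, and `Deduced` are hypothetical names, the deduced form is restricted to arithmetic types, and bounds checks are omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical reader over a caller-owned byte buffer (no bounds checks here).
class Reader
{
public:
    explicit Reader(const char* data) : m_data(data), m_pos(0) {}

    // Explicit form: the caller names the type to read.
    template <typename T>
    T ReadAs()
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy needs a trivially copyable type");
        T value;
        std::memcpy(&value, m_data + m_pos, sizeof(T));
        m_pos += sizeof(T);
        return value;
    }

    // Proxy returned by Read(): the actual read happens when the proxy is
    // converted, and T is deduced from the type being initialized.
    struct Deduced
    {
        Reader& reader;

        template <typename T,
                  typename = typename std::enable_if<std::is_arithmetic<T>::value>::type>
        operator T() const { return reader.ReadAs<T>(); }
    };

    Deduced Read() { return Deduced{*this}; }

private:
    const char* m_data;
    std::size_t m_pos;
};

// Round-trips an int and a float through a raw buffer using the deduced form.
inline bool DeducedReadDemo()
{
    char buffer[sizeof(int) + sizeof(float)];
    const int w = 5;
    const float x = 4.736f;
    std::memcpy(buffer, &w, sizeof w);
    std::memcpy(buffer + sizeof w, &x, sizeof x);

    Reader r(buffer);
    int rw = r.Read();    // T deduced as int
    float rx = r.Read();  // T deduced as float
    return rw == w && rx == x;
}
```

Note that `auto v = r.Read();` would store the proxy rather than a value, so the explicit `ReadAs<T>()` form is the safer choice wherever the target type is not spelled out.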

Endianness doesn't matter. The architecture that reads and writes the data will never differ. We only need to store binary data inside the container. The reading and writing applications are solely responsible for reading data in the same order it was written. Only basic types need to be written, not arbitrary classes or pointers to things. Most importantly, speed is the highest priority: once the data is produced, we need to write it to the container, transfer it over the network, and read it on the other end as fast as possible.

Network transmission is currently done with the low-level WinSock RIO API, and we are already moving data from the application to the wire as fast as possible. Transmission latency across the wire will always be much higher and more variable. Serialization before transmission is the next step in the chain where we want to waste as little time as possible before getting our data out on the wire.

New packets will be received very quickly, and as such the ability to preallocate resources would be beneficial. For example:

Serializer DataHandler;
...
void NewIncomingPacket(const char* Data)
{
    DataHandler.Reset();
    DataHandler.Load(Data);
    int x = DataHandler.Read();
    float y = DataHandler.Read();
    ...
}
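As a rough illustration of the preallocation idea, the handler can be a non-owning reader that only stores a pointer and a cursor, so loading a new packet is a few assignments and never allocates. This is a sketch under assumptions, not a definitive design: `PacketReader`, `Remaining`, and the extra `size` parameter on `Load` are hypothetical (the snippet above passes only a pointer), and `Load` here also plays the role of `Reset`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical non-owning reader: it never allocates, it only tracks a cursor
// into a buffer owned by the networking layer.
class PacketReader
{
public:
    // Point the reader at a new packet and rewind the cursor.
    void Load(const char* data, std::size_t size)
    {
        m_data = data;
        m_size = size;
        m_pos = 0;
    }

    // Copy one trivially copyable value out of the packet.
    // Returns false and leaves `out` untouched if too few bytes remain.
    template <typename T>
    bool Read(T& out)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy needs a trivially copyable type");
        if (sizeof(T) > m_size - m_pos) return false;
        std::memcpy(&out, m_data + m_pos, sizeof(T));
        m_pos += sizeof(T);
        return true;
    }

    // Number of bytes not yet consumed.
    std::size_t Remaining() const { return m_size - m_pos; }

private:
    const char* m_data = nullptr;
    std::size_t m_size = 0;
    std::size_t m_pos = 0;
};

// Builds a fake packet by hand, then reads it back through one reader.
inline bool PacketReaderDemo()
{
    char packet[sizeof(int) + sizeof(float)];
    const int w = 5;
    const float x = 4.736f;
    std::memcpy(packet, &w, sizeof w);
    std::memcpy(packet + sizeof w, &x, sizeof x);

    PacketReader reader;
    reader.Load(packet, sizeof packet);

    int rw = 0;
    float rx = 0.0f;
    char extra = 0;
    const bool ok = reader.Read(rw) && reader.Read(rx);
    const bool overrun = reader.Read(extra);  // nothing left, must fail
    return ok && !overrun && rw == w && rx == x && reader.Remaining() == 0;
}
```

Because the reader does not copy the packet, it is only valid while the receive buffer is; with registered RIO buffers that means finishing the reads before the buffer is handed back for reuse.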

I'm looking for input from community experts on which direction to go here.

I've written seriously, a fast, header-only C++ library that should do what you want :-)

It provides both a serializer and a de-serializer.

Serialized data is portable across different architectures and endianness. No external dependencies.

    seriously::Packer<1024> packer;     // a 1024 byte serialization buffer

    int32_t value1 = 83656;
    bool value2 = true;
    int16_t value3 = -2345;
    std::string value4("only an example");
    double value5 = -6.736;
    std::vector<int64_t> value6;

    value6.push_back(42);
    value6.push_back(11);
    value6.push_back(93);

    packer << value1 << value2 << value3 << value4 << value5 << value6;

    std::cout << "packed size: " << packer.size() << std::endl;
    // packer.data() contains the serialized data

    int32_t restored1;
    bool restored2;
    int16_t restored3;
    std::string restored4;
    double restored5;
    std::vector<int64_t> restored6;

    packer >> restored1 >> restored2 >> restored3 >> restored4 >> restored5 >> restored6;

    std::cout << "unpacked: " << restored1 << " " << (restored2 ? "t" : "f") << " " << restored3 << " " << restored4 << " " << restored5 << std::endl;

    std::vector<int64_t>::const_iterator it;
    for (it = restored6.begin(); it != restored6.end(); it++) {
        std::cout << *it << std::endl;
    }
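For readers wondering how serialized data can be independent of host endianness at all, a common technique is to build each integer from shifts instead of copying its in-memory bytes. The sketch below illustrates that general technique only; it is not taken from the library above, and `StoreBigEndian32` and `LoadBigEndian32` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Store a 32-bit value most-significant byte first, whatever the host byte order.
inline void StoreBigEndian32(std::uint32_t value, unsigned char* out)
{
    out[0] = static_cast<unsigned char>(value >> 24);
    out[1] = static_cast<unsigned char>(value >> 16);
    out[2] = static_cast<unsigned char>(value >> 8);
    out[3] = static_cast<unsigned char>(value);
}

// Rebuild the value from four bytes stored most-significant byte first.
inline std::uint32_t LoadBigEndian32(const unsigned char* in)
{
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}

// Round-trips one value and checks the byte layout (83656 == 0x000146C8).
inline bool BigEndianDemo()
{
    unsigned char buf[4];
    StoreBigEndian32(83656u, buf);
    return buf[0] == 0x00 && buf[1] == 0x01 && buf[2] == 0x46 && buf[3] == 0xC8 &&
           LoadBigEndian32(buf) == 83656u;
}
```

When both ends are guaranteed to share an architecture, as in the question, this per-byte work is unnecessary and a plain copy of the object representation is enough.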

If you don't care about endianness and only want to serialize trivial types, then a simple memcpy will be the fastest option and is also safe. Just memcpy into and out of the buffer when serializing and deserializing.

#include <iostream>
#include <vector>
#include <cstring>
#include <cstdint>
#include <type_traits>
#include <cstddef>

template <std::size_t CapacityV>
struct BinaryContainer
{
    BinaryContainer() :
        m_write(0),
        m_read(0)
    {
    }

    template <typename T>
    void write(const std::vector<T>& vec)
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        const std::size_t bytes = vec.size() * sizeof(T);
        std::memcpy(m_buffer + m_write, vec.data(), bytes);
        m_write += bytes;
    }

    template <typename T>
    void write(T value)
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        const std::size_t bytes = sizeof(T);
        std::memcpy(m_buffer + m_write, &value, bytes);
        m_write += bytes;
    }

    template <typename T>
    std::vector<T> read(std::size_t count)
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        std::vector<T> result;
        result.resize(count);

        const std::size_t bytes = count * sizeof(T);
        std::memcpy(result.data(), m_buffer + m_read, bytes);
        m_read += bytes;

        return result;
    }

    template <typename T>
    T read()
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        T result;

        const std::size_t bytes = sizeof(T);
        std::memcpy(&result, m_buffer + m_read, bytes);
        m_read += bytes;

        return result;
    }

    const char* data() const
    {
        return m_buffer;
    }

    std::size_t size() const
    {
        return m_write;
    }

private:
    std::size_t m_write;
    std::size_t m_read;
    char m_buffer[CapacityV]; // or a dynamically sized equivalent
};

int main()
{

    BinaryContainer<1024> cont;

    {
        std::vector<std::uint32_t> values = {1, 2, 3, 4, 5};
        // probably want to make serializing size part of the vector serializer
        cont.write(values.size());
        cont.write(values);
    }

    {
        auto size = cont.read<std::vector<std::uint32_t>::size_type>();
        auto values = cont.read<std::uint32_t>(size);

        for (auto val : values) std::cout << val << ' ';
    }
}

Demo: http://coliru.stacked-crooked.com/a/4d176a41666dbad1
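The container above leaves the vector length to the caller and has no `std::string` support, both of which the question asks for. A possible extension, sketched here under assumptions rather than as a definitive implementation, is to write a fixed-width length prefix followed by the raw bytes. `ByteStream`, `ReadString`, and `ReadVector` are hypothetical names, and the `std::uint32_t` prefix is an arbitrary choice that caps each element count at 2^32 - 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical growable byte buffer with length-prefixed strings and vectors.
class ByteStream
{
public:
    // Trivially copyable values are copied byte for byte.
    template <typename T>
    void Write(const T& value)
    {
        static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
        WriteBytes(&value, sizeof(T));
    }

    // Strings: 32-bit length, then the characters (no terminator).
    void Write(const std::string& text)
    {
        Write(static_cast<std::uint32_t>(text.size()));
        WriteBytes(text.data(), text.size());
    }

    // Vectors: 32-bit element count, then the elements as raw bytes.
    template <typename T>
    void Write(const std::vector<T>& items)
    {
        static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
        Write(static_cast<std::uint32_t>(items.size()));
        WriteBytes(items.data(), items.size() * sizeof(T));
    }

    template <typename T>
    T Read()
    {
        static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
        T value;
        ReadBytes(&value, sizeof(T));
        return value;
    }

    std::string ReadString()
    {
        const std::uint32_t n = Read<std::uint32_t>();
        std::string text(n, '\0');
        ReadBytes(&text[0], n);
        return text;
    }

    template <typename T>
    std::vector<T> ReadVector()
    {
        const std::uint32_t n = Read<std::uint32_t>();
        std::vector<T> items(n);
        ReadBytes(items.data(), n * sizeof(T));
        return items;
    }

private:
    void WriteBytes(const void* src, std::size_t n)
    {
        const std::size_t old = m_bytes.size();
        m_bytes.resize(old + n);
        if (n != 0) std::memcpy(m_bytes.data() + old, src, n);
    }

    // Throws instead of reading past the end of the written data.
    void ReadBytes(void* dst, std::size_t n)
    {
        if (n > m_bytes.size() - m_read) throw std::out_of_range("read past end of stream");
        if (n != 0) std::memcpy(dst, m_bytes.data() + m_read, n);
        m_read += n;
    }

    std::vector<char> m_bytes;
    std::size_t m_read = 0;
};

// Writes the question's values, then reads them back in the same order.
inline bool ByteStreamDemo()
{
    ByteStream s;
    s.Write(5);
    s.Write(std::string("Some String."));
    s.Write(std::vector<int>{1, 2, 3});

    const int w = s.Read<int>();
    const std::string y = s.ReadString();
    const std::vector<int> z = s.ReadVector<int>();
    return w == 5 && y == "Some String." && z == std::vector<int>{1, 2, 3};
}
```

A fixed-width prefix keeps the wire format the same even if `std::size_t` differs between builds. For the preallocation goal in the question, the `std::vector<char>` storage could be swapped for a fixed-capacity array like the one in the container above.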
