
Efficient Binary Serialization Of Mostly Basic Types

I'm trying to figure out the best approach for transferring some data over the network. Here is what I'm hoping to achieve:

The application runs and computes some data:

int w = 5;
float x = 4.736;
std::string y = "Some String.";
std::vector<int> z;
z.push_back(1);
z.push_back(2);
z.push_back(3);

Then we put it in a binary container:

BinaryContainer Data;
Data.Write(w);
Data.Write(x);
Data.Write(y);
Data.Write(z);

We then transfer it over the network:

SendData(Data.c_str());

And read it out on the other side:

BinaryContainer ReceivedData(IncomingData);
int w = ReceivedData.Read();
float x = ReceivedData.Read();
std::string y = ReceivedData.Read();
std::vector<int> z = ReceivedData.Read();

The example above outlines, from a high-level perspective, how the basic functionality should work. I've looked at many different serialization libraries and none seem to fit quite right. I'm leaning towards learning how to write the functionality myself.

Endianness doesn't matter. The architecture that reads and writes data will never differ. We only need to store binary data inside the container. The reading and writing applications are solely responsible for reading data in the same order it was written. Only basic types need to be written, no arbitrary classes or pointers to things. Most importantly, speed is the highest priority: once the data is computed, we need to write it to the container, transfer it over the network, and read it on the other end as fast as possible.

Network transmission is currently done using the low-level WinSock RIO API, and we're already moving data from the application to the wire as fast as possible. Latency across the wire will always be much higher and more variable. Serializing our data before transmission is the next step in the chain where we want to waste as little time as possible before getting the data out on the wire.

New packets will be received very quickly, so the ability to preallocate resources would be beneficial. For example:

Serializer DataHandler;
...
void NewIncomingPacket(const char* Data)
{
    DataHandler.Reset();
    DataHandler.Load(Data);
    int x = DataHandler.Read();
    float y = DataHandler.Read();
    ...
}
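
A minimal sketch of what such a reusable reader might look like, assuming the packet size is known on receipt. The class name `PacketReader`, the extra `size` parameter on `Load`, and the `Read(T&)` signature are assumptions for illustration; the out-parameter is used because C++ cannot pick a `Read()` overload from the return type alone. The reader only stores a pointer into the incoming packet, so handling a new packet allocates nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical reader; names loosely mirror the pseudocode above.
class PacketReader
{
public:
    // Point the reader at a received packet without copying it.
    void Load(const char* data, std::size_t size)
    {
        assert(data != nullptr || size == 0);
        m_data = data;
        m_size = size;
        m_pos = 0;
    }

    // Forget the current packet so the object can be reused.
    void Reset()
    {
        m_data = nullptr;
        m_size = 0;
        m_pos = 0;
    }

    // Copy the next sizeof(T) bytes into out. Returns false and leaves the
    // read position untouched if the packet is too short.
    template <typename T>
    bool Read(T& out)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "Read only supports trivially copyable types");
        if (m_size - m_pos < sizeof(T)) return false;
        std::memcpy(&out, m_data + m_pos, sizeof(T));
        m_pos += sizeof(T);
        return true;
    }

    // Number of unread bytes left in the current packet.
    std::size_t Remaining() const { return m_size - m_pos; }

private:
    const char* m_data = nullptr;
    std::size_t m_size = 0;
    std::size_t m_pos = 0;
};
```

With this shape, the handler would call `Load(Data, Size)` once per packet and then `Read(x)`, `Read(y)` in the order the fields were written, checking the return value to catch truncated packets.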

I'm looking for input from community experts on which direction to go here.

I've written seriously, a header-only, fast C++ library that should do what you want :-)

It provides both a serializer and a de-serializer.

Serialized data is portable across different architectures and endianness. No external dependencies.
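
How seriously achieves this portability is not shown here. As a general illustration only, one common way to make an integer's encoding independent of host byte order is to write its bytes in a fixed order using shifts. The function names `put_u32_be` and `get_u32_be` are hypothetical and not part of the library.

```cpp
#include <cstdint>

// Store a 32-bit value most-significant byte first, whatever the host byte order.
inline void put_u32_be(unsigned char* out, std::uint32_t v)
{
    out[0] = static_cast<unsigned char>(v >> 24);
    out[1] = static_cast<unsigned char>(v >> 16);
    out[2] = static_cast<unsigned char>(v >> 8);
    out[3] = static_cast<unsigned char>(v);
}

// Rebuild the value from four bytes stored most-significant byte first.
inline std::uint32_t get_u32_be(const unsigned char* in)
{
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

Because the shifts operate on values rather than on memory layout, the same bytes come out on little-endian and big-endian hosts.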

    seriously::Packer<1024> packer;     // a 1024 byte serialization buffer

    int32_t value1 = 83656;
    bool value2 = true;
    int16_t value3 = -2345;
    std::string value4("only an example");
    double value5 = -6.736;
    std::vector<int64_t> value6;

    value6.push_back(42);
    value6.push_back(11);
    value6.push_back(93);

    packer << value1 << value2 << value3 << value4 << value5 << value6;

    std::cout << "packed size: " << packer.size() << std::endl;
    // packer.data() contains the serialized data

    int32_t restored1;
    bool restored2;
    int16_t restored3;
    std::string restored4;
    double restored5 = -6.736;
    std::vector<int64_t> restored6;

    packer >> restored1 >> restored2 >> restored3 >> restored4 >> restored5 >> restored6;

    std::cout << "unpacked: " << restored1 << " " << (restored2 ? "t" : "f") << " " << restored3 << " " << restored4 << " " << restored5 << std::endl;

    std::vector<int64_t>::const_iterator it;
    for (it = restored6.begin(); it != restored6.end(); it++) {
        std::cout << *it << std::endl;
    }

If you don't care about endianness and only want to serialize trivial types, then a simple memcpy will be the fastest and also safe. Just memcpy into/out of the buffer when serializing/deserializing.

#include <iostream>
#include <vector>
#include <cstring>
#include <cstdint>
#include <type_traits>
#include <cstddef>

template <std::size_t CapacityV>
struct BinaryContainer
{
    BinaryContainer() :
        m_write(0),
        m_read(0)
    {
    }

    template <typename T>
    void write(const std::vector<T>& vec)
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        const std::size_t bytes = vec.size() * sizeof(T);
        std::memcpy(m_buffer + m_write, vec.data(), bytes);
        m_write += bytes;
    }

    template <typename T>
    void write(T value)
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        const std::size_t bytes = sizeof(T);
        std::memcpy(m_buffer + m_write, &value, bytes);
        m_write += bytes;
    }

    template <typename T>
    std::vector<T> read(std::size_t count)
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        std::vector<T> result;
        result.resize(count);

        const std::size_t bytes = count * sizeof(T);
        std::memcpy(result.data(), m_buffer + m_read, bytes);
        m_read += bytes;

        return result;
    }

    template <typename T>
    T read()
    {
        static_assert(std::is_trivial_v<T>);

        // TODO: check if access is valid

        T result;

        const std::size_t bytes = sizeof(T);
        std::memcpy(&result, m_buffer + m_read, bytes);
        m_read += bytes;

        return result;
    }

    const char* data() const
    {
        return m_buffer;
    }

    std::size_t size() const
    {
        return m_write;
    }

private:
    std::size_t m_write;
    std::size_t m_read;
    char m_buffer[CapacityV]; // or a dynamically sized equivalent
};

int main()
{

    BinaryContainer<1024> cont;

    {
        std::vector<std::uint32_t> values = {1, 2, 3, 4, 5};
        // probably want to make serializing size part of the vector serializer
        cont.write(values.size());
        cont.write(values);
    }

    {
        auto size = cont.read<std::vector<std::uint32_t>::size_type>();
        auto values = cont.read<std::uint32_t>(size);

        for (auto val : values) std::cout << val << ' ';
    }
}

Demo: http://coliru.stacked-crooked.com/a/4d176a41666dbad1
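
The comment in `main` suggests making the size part of the vector serializer, and the question also needs `std::string`. A rough sketch of that idea follows, assuming a 32-bit element count in host byte order is acceptable (the question states both ends share an architecture). `PrefixedBuffer` and its member names are hypothetical and separate from the `BinaryContainer` above; it also fills in the bounds checks left as TODOs there. It will not work for `std::vector<bool>`, which has no `data()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical fixed-capacity buffer that stores a count before each sequence.
template <std::size_t Capacity>
class PrefixedBuffer
{
    static_assert(Capacity > 0 && Capacity <= 0xFFFFFFFFu,
                  "counts are stored as 32-bit values");

public:
    // 32-bit character count followed by the raw characters.
    void write(const std::string& s) { write_block(s.data(), s.size(), 1); }

    // 32-bit element count followed by the raw elements.
    template <typename T>
    void write(const std::vector<T>& v)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy requires a trivially copyable element type");
        write_block(v.data(), v.size(), sizeof(T));
    }

    std::string read_string()
    {
        const std::size_t count = read_count(1);
        std::string s(m_buffer + m_read, count);
        m_read += count;
        return s;
    }

    template <typename T>
    std::vector<T> read_vector()
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy requires a trivially copyable element type");
        const std::size_t count = read_count(sizeof(T));
        std::vector<T> v(count);
        if (count != 0) std::memcpy(v.data(), m_buffer + m_read, count * sizeof(T));
        m_read += count * sizeof(T);
        return v;
    }

    std::size_t size() const { return m_write; }

private:
    void write_block(const void* src, std::size_t count, std::size_t elem_size)
    {
        const std::size_t free_bytes = Capacity - m_write;
        // Check the whole record up front so a failed write leaves the buffer unchanged.
        if (free_bytes < sizeof(std::uint32_t) ||
            count > (free_bytes - sizeof(std::uint32_t)) / elem_size)
            throw std::length_error("PrefixedBuffer: not enough space");
        const std::uint32_t n = static_cast<std::uint32_t>(count);
        std::memcpy(m_buffer + m_write, &n, sizeof n);
        m_write += sizeof n;
        if (count != 0) std::memcpy(m_buffer + m_write, src, count * elem_size);
        m_write += count * elem_size;
    }

    // Read the count and verify that count * elem_size bytes are still available.
    std::size_t read_count(std::size_t elem_size)
    {
        std::uint32_t n = 0;
        if (m_write - m_read < sizeof n)
            throw std::out_of_range("PrefixedBuffer: truncated count");
        std::memcpy(&n, m_buffer + m_read, sizeof n);
        if (n > (m_write - m_read - sizeof n) / elem_size)
            throw std::out_of_range("PrefixedBuffer: truncated payload");
        m_read += sizeof n;
        return n;
    }

    std::size_t m_write = 0;
    std::size_t m_read = 0;
    char m_buffer[Capacity];
};
```

Usage would be along the lines of `buf.write(y); buf.write(z);` on the sending side and `buf.read_string(); buf.read_vector<int>();` on the receiving side, in the same order. Throwing on overflow is one design choice; returning an error code instead may suit a latency-sensitive path better.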
