
Binary "bulk" serialization of bitfields

I have objects with lots of boolean properties, so I am using bitfields to pack the properties more compactly. I also want to be able to serialize and deserialize those properties in a compact way, i.e. not field by field but by serializing and deserializing the 64-bit unsigned integer that holds the fields. This is not only much faster (it avoids all the shifting and so on) but also 8 times more memory efficient (one bit instead of one byte per property).

However, I read that the standard provides no guarantee that the bitfield implementation will be uniform across different platforms. Can I expect "bulk" binary serialization of the bitfield container to produce uniform results across platforms? Or would it be safer to use manual shifting and masking when working with the properties, so that bulk serialization and deserialization is possible?

You can have a look at std::bitset:

It provides well-defined functions to convert your bits into an unsigned long long and to create a bitset from a stored unsigned long long. The standard specifies that the first bit of a bitset (position 0) is the least significant bit of the unsigned long long representation.

So you can have something like:

 std::bitset<N> bits;
 unsigned long long val = bits.to_ullong();
 // serialize your ullong value
 // load ullong from serialized data
 unsigned long long val2 = ...;
 std::bitset<N> newBits(val2);

So, as long as your serialization can store and load an unsigned long long correctly, you are good to go.
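Storing the unsigned long long "correctly" across platforms usually means fixing the byte order of the serialized form. The following is a minimal sketch of one way to do that, assuming a 64-bit bitset and a little-endian wire format; the helper names (to_le_bytes, from_le_bytes, pack, unpack) are made up for illustration and are not part of the standard library.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: encode a 64-bit value as 8 bytes, least significant
// byte first. Shifts work on values, so the result does not depend on the
// host's endianness.
std::array<unsigned char, 8> to_le_bytes(std::uint64_t v) {
    std::array<unsigned char, 8> out{};
    for (std::size_t i = 0; i < 8; ++i) {
        out[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFFu);
    }
    return out;
}

// Hypothetical helper: inverse of to_le_bytes.
std::uint64_t from_le_bytes(const std::array<unsigned char, 8>& in) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        v |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    }
    return v;
}

// Bitset -> bytes: bit 0 of the bitset ends up as bit 0 of byte 0.
std::array<unsigned char, 8> pack(const std::bitset<64>& bits) {
    return to_le_bytes(bits.to_ullong());
}

// Bytes -> bitset.
std::bitset<64> unpack(const std::array<unsigned char, 8>& bytes) {
    return std::bitset<64>(from_le_bytes(bytes));
}
```

The 8 bytes returned by pack can be written to a file or socket as they are, and unpack restores the same bitset on any host, whatever its native byte order.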

The only problem is when you have a bitset which is too big for an unsigned long long. In this case to_ullong() throws std::overflow_error if the value does not fit, and the standard provides no simple way to extract the bits in bulk.
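For larger bitsets, one workaround is to copy the bits into several 64-bit words yourself. This is only a sketch: to_words and from_words are hypothetical names, and the loop goes bit by bit, so it trades speed for portability.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: split a bitset of any size into 64-bit words.
// Bit i of the bitset becomes bit (i % 64) of word (i / 64).
template <std::size_t N>
std::vector<std::uint64_t> to_words(const std::bitset<N>& bits) {
    std::vector<std::uint64_t> words((N + 63) / 64, 0);
    for (std::size_t i = 0; i < N; ++i) {
        if (bits[i]) {
            words[i / 64] |= std::uint64_t{1} << (i % 64);
        }
    }
    return words;
}

// Hypothetical helper: rebuild the bitset from the words.
// Bits whose word is missing from the vector are left cleared.
template <std::size_t N>
std::bitset<N> from_words(const std::vector<std::uint64_t>& words) {
    std::bitset<N> bits;
    for (std::size_t i = 0; i < N && i / 64 < words.size(); ++i) {
        if ((words[i / 64] >> (i % 64)) & 1u) {
            bits.set(i);
        }
    }
    return bits;
}
```

Each word can then be serialized like any other 64-bit integer, for example to_words(b) for a std::bitset<100> b yields two words.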

One possibility would be to use ASN.1 to handle this via a BIT STRING. It defines the serialization precisely, in a way that is independent of the local representation. This keeps it consistent across platforms regardless of whether the local platform is big-endian or little-endian. You can play with a free online ASN.1 compiler and encoder/decoder at http://asn1-playground.oss.com to see the resulting serialization.

ASN.1 also allows you to give a "name" to each bit, so that you can easily set or check each named bit in a bit string.

The variation in endianness across platforms suggests that any such serialization would be non-portable. Based on that, I would say that you cannot expect bulk binary serialization of the bitfield container to be uniform across platforms.

A solution would have to account for the bit ordering and correct for it depending on the platform.
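One way to sidestep the compiler's bitfield layout altogether is the manual shifting and masking mentioned in the question: keep the properties in a fixed-width integer and choose the bit positions yourself. A minimal sketch follows, with made-up property names (Visible, Enabled, Dirty) and a hypothetical Flags class:

```cpp
#include <cstdint>

// Hypothetical property names; each enumerator is a bit position.
enum Property : unsigned {
    Visible = 0,
    Enabled = 1,
    Dirty   = 2
};

// Hypothetical container: all properties live in one 64-bit word, and the
// bit positions are fixed by the code rather than by the compiler.
class Flags {
public:
    Flags() = default;
    explicit Flags(std::uint64_t raw) : raw_(raw) {}

    // Read one property by shifting it down to bit 0 and masking.
    bool get(Property p) const { return (raw_ >> p) & 1u; }

    // Set or clear one property without touching the others.
    void set(Property p, bool on) {
        const std::uint64_t mask = std::uint64_t{1} << p;
        raw_ = on ? (raw_ | mask) : (raw_ & ~mask);
    }

    // The whole word, for bulk serialization.
    std::uint64_t raw() const { return raw_; }

private:
    std::uint64_t raw_ = 0;
};
```

Because shifts operate on values, the same property always maps to the same bit of raw() on every platform. The raw value still has to be written in an agreed byte order when it leaves the process.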

This might help you: it's a small example of serializing various types. I added support for bitsets and raw bit values, which can be used as shown below.

(All examples are at https://github.com/goblinhack/simple-c-plus-plus-serializer)

class BitsetClass {
public:
    std::bitset<1> a;
    std::bitset<2> b;
    std::bitset<3> c;

    unsigned int d:1; // need c++20 for default initializers for bitfields
    unsigned int e:2;
    unsigned int f:3;
    BitsetClass(void) { d = 0; e = 0; f = 0; }

    friend std::ostream& operator<<(std::ostream &out,
                                    Bits<const class BitsetClass & > const my)
    {
        out << bits(my.t.a);
        out << bits(my.t.b);
        out << bits(my.t.c);

        // Pack the three bitfields into 6 bits: d at bit 0, e at bits 1-2, f at bits 3-5.
        std::bitset<6> s(my.t.d | (my.t.e << 1) | (my.t.f << 3));
        out << bits(s);

        return (out);
    }

    friend std::istream& operator>>(std::istream &in,
                                    Bits<class BitsetClass &> my)
    {
        std::bitset<1> a;
        in >> bits(a);
        my.t.a = a;

        in >> bits(my.t.b);
        in >> bits(my.t.c);
        std::bitset<6> s;
        in >> bits(s);

        unsigned long raw_bits = s.to_ulong();
        my.t.d = raw_bits & 0b000001;
        my.t.e = (raw_bits & 0b000110) >> 1;
        my.t.f = (raw_bits & 0b111000) >> 3;

        return (in);
    }
};
