
Binary Serialization of std::bitset

std::bitset has a to_string() method that serializes it as a char-based string of 1s and 0s. This uses a full 8-bit char for each bit in the bitset, making the serialized representation 8 times longer than necessary.
I want to store the bitset in a binary representation to save space. The to_ulong() method helps only when the bitset fits in an unsigned long (typically 32 or 64 bits). I have hundreds of bits.
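To make the to_ulong() limit concrete: the call throws std::overflow_error as soon as a set bit does not fit in an unsigned long. A minimal sketch; the helper name fits_in_ulong is hypothetical and only wraps the standard call.

```cpp
#include <bitset>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: reports whether to_ulong() succeeds for this bitset.
template <std::size_t N>
bool fits_in_ulong(const std::bitset<N>& bs)
{
    try {
        (void)bs.to_ulong();   // throws std::overflow_error if a high bit is set
        return true;
    } catch (const std::overflow_error&) {
        return false;
    }
}
```

A large bitset with only low bits set still converts; the failure depends on which bits are set, not on N alone.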
I'm not sure I want to use memcpy() / std::copy() on the object (address) itself, as that assumes the object is a POD.

The API does not seem to provide a handle to the internal array representation from which I could have taken the address.

I would also like the option to deserialize the bitset from the binary representation.

How can I do this?

This is a possible approach based on explicit creation of an std::vector<unsigned char> by reading/writing one bit at a time...

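#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Pack the bits into bytes: bit j goes to bit (j % 8) of byte (j / 8).
template<std::size_t N>
std::vector<unsigned char> bitset_to_bytes(const std::bitset<N>& bs)
{
    std::vector<unsigned char> result((N + 7) >> 3);  // zero-initialized
    for (std::size_t j = 0; j < N; j++)
        result[j >> 3] |= static_cast<unsigned char>(bs[j] << (j & 7));
    return result;
}

// Inverse of bitset_to_bytes; the buffer must hold exactly ceil(N / 8) bytes.
template<std::size_t N>
std::bitset<N> bitset_from_bytes(const std::vector<unsigned char>& buf)
{
    assert(buf.size() == ((N + 7) >> 3));
    std::bitset<N> result;
    for (std::size_t j = 0; j < N; j++)
        result[j] = ((buf[j >> 3] >> (j & 7)) & 1);
    return result;
}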

Note that when calling the deserialization function template bitset_from_bytes, the bitset size N must be specified explicitly in the call, for example:

std::bitset<N> bs1;
...
std::vector<unsigned char> buffer = bitset_to_bytes(bs1);
...
std::bitset<N> bs2 = bitset_from_bytes<N>(buffer);

If you really care about speed, one option is to unroll the loop so that the packing is done, for example, one byte at a time. Better still is to write your own bitset implementation that exposes its internal binary representation, instead of using std::bitset.
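As a rough sketch of the "one byte at a time" idea, the variant below builds each output byte in a local variable and stores it once. The function name bitset_to_bytes_bytewise is hypothetical, the byte layout is assumed to match bitset_to_bytes, and whether it is actually faster depends on the compiler, so it should be measured.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Sketch: same byte layout as bitset_to_bytes, but the inner loop builds one
// whole byte in a local variable before storing it.
template <std::size_t N>
std::vector<unsigned char> bitset_to_bytes_bytewise(const std::bitset<N>& bs)
{
    std::vector<unsigned char> result((N + 7) / 8);
    for (std::size_t byte = 0; byte < result.size(); ++byte) {
        unsigned int packed = 0;
        const std::size_t base = byte * 8;
        // The last byte may cover fewer than 8 valid bits.
        for (std::size_t bit = 0; bit < 8 && base + bit < N; ++bit)
            packed |= static_cast<unsigned int>(bs[base + bit]) << bit;
        result[byte] = static_cast<unsigned char>(packed);
    }
    return result;
}
```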

As suggested by people at gamedev.net, one can try boost::dynamic_bitset, since it allows access to the internal representation of the bit-packed data.

Edit: The following does not work as intended. Apparently, "binary format" actually means "ASCII representation of binary".


You should be able to write them to a std::ostream using operator<<. It says here:

[Bitsets] can also be directly inserted and extracted from streams in binary format.
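A quick check is consistent with the edit note above: operator<< writes one '0' or '1' character per bit, not packed bytes. The helper name stream_output is hypothetical.

```cpp
#include <bitset>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns exactly what operator<< writes for a bitset.
template <std::size_t N>
std::string stream_output(const std::bitset<N>& bs)
{
    std::ostringstream out;
    out << bs;            // writes N characters, each '0' or '1'
    return out.str();
}
```

For std::bitset<8>(5) this yields the 8-character string "00000101", so the stream form is no smaller than to_string().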

Answering my own question for completeness.

Apparently, there is no simple and portable way of doing this.

For simplicity (though not efficiency), I ended up using to_string, then creating consecutive 32-bit bitsets from each 32-character chunk of the string (plus the remainder*), and calling to_ulong on each of these to collect the bits into a binary buffer.
This approach leaves the bit-twiddling to the STL itself, though it is probably not the most efficient way to do this.

* Note that since std::bitset is templated on the total bit-count, the remainder bitset needs to use some simple template meta-programming arithmetic.
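The chunked approach could look roughly like the sketch below. It is not the original poster's code: the names bitset_to_words and bitset_from_words are hypothetical, and the remainder is handled with a zero-padded std::bitset<32> instead of a smaller bitset, which avoids the template arithmetic at the cost of a slightly different layout for the last word.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Split to_string() into 32-character pieces and let std::bitset<32>
// convert each piece to an integer.
template <std::size_t N>
std::vector<std::uint32_t> bitset_to_words(const std::bitset<N>& bs)
{
    const std::string text = bs.to_string();   // text[0] is the highest bit
    std::vector<std::uint32_t> words;
    for (std::size_t pos = 0; pos < text.size(); pos += 32) {
        // A final chunk shorter than 32 characters leaves the upper bits zero.
        const std::bitset<32> chunk(text, pos, 32);
        words.push_back(static_cast<std::uint32_t>(chunk.to_ulong()));
    }
    return words;
}

// Inverse: rebuild the N-character string, then construct the bitset from it.
template <std::size_t N>
std::bitset<N> bitset_from_words(const std::vector<std::uint32_t>& words)
{
    assert(words.size() == (N + 31) / 32);
    std::string text;
    for (std::size_t i = 0; i < words.size(); ++i) {
        const std::size_t pos = i * 32;
        const std::size_t len = (N - pos < 32) ? N - pos : 32;
        // Keep only the low `len` characters of the 32-character rendering.
        text += std::bitset<32>(words[i]).to_string().substr(32 - len);
    }
    return std::bitset<N>(text);
}
```

The words are in most-significant-first order because to_string() puts the highest bit first; writing them to disk still requires choosing a byte order.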

I can't see an obvious way other than converting to a string and doing your own serialization of the string that groups chunks of 8 characters into a single serialized byte.

EDIT: It is better to just iterate over all the bits with operator[] and serialize them manually.

This might help you: it's a small example of various serialization types. I added bitset and raw bit-field values, which can be used as shown below.

(all examples at https://github.com/goblinhack/simple-c-plus-plus-serializer )

class BitsetClass {
public:
    std::bitset<1> a;
    std::bitset<2> b;
    std::bitset<3> c;

    unsigned int d:1; // need c++20 for default initializers for bitfields
    unsigned int e:2;
    unsigned int f:3;
    BitsetClass(void) { d = 0; e = 0; f = 0; }

    friend std::ostream& operator<<(std::ostream &out,
                                    Bits<const class BitsetClass & > const my)
    {
        out << bits(my.t.a);
        out << bits(my.t.b);
        out << bits(my.t.c);

        std::bitset<6> s(my.t.d | my.t.e << 1 | my.t.f << 3);
        out << bits(s);

        return (out);
    }

    friend std::istream& operator>>(std::istream &in,
                                    Bits<class BitsetClass &> my)
    {
        std::bitset<1> a;
        in >> bits(a);
        my.t.a = a;

        in >> bits(my.t.b);
        in >> bits(my.t.c);
        std::bitset<6> s;
        in >> bits(s);

        unsigned long raw_bits = static_cast<unsigned long>(s.to_ulong());
        my.t.d = raw_bits & 0b000001;
        my.t.e = (raw_bits & 0b000110) >> 1;
        my.t.f = (raw_bits & 0b111000) >> 3;

        return (in);
    }
};
