
C++ Most efficient way to store mutable binary data in a variable?

I'm working with a stream of binary strings and I'd like to assign this data to datatypes in the most memory-efficient way.

Example: an int typically uses 4 bytes to cover the range -2147483648 to 2147483647.

Since my data is just ones and zeros, what is the best datatype to use for a large stream of binary data?

I've tried using bitset, but operations such as reading the bits and converting them to a number (0001 to 1, 0101 to 5) run slower than with integer datatypes.

Is there any other efficient way to store and traverse binary data in a data type?

I suspect that what you mean by "more efficient way" is the encoding whose average size per symbol comes closest to the Shannon information entropy of the data.

A very common way to encode data so that its size approaches that entropy is to use variable-length codes.
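As a rough illustration of a variable-length code, here is a minimal sketch of a base-128 varint (the LEB128 style of encoding), which spends fewer bytes on small values than on large ones. It is not an entropy-optimal code such as Huffman coding, and the function names put_varint and get_varint are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` to `out` as a little-endian base-128 varint:
// 7 payload bits per byte, high bit set on every byte except the last.
void put_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Decode one varint starting at `pos` and advance `pos` past it.
// Assumes well-formed input; decoding stops at the end of the buffer
// or once 64 bits' worth of payload have been consumed.
std::uint64_t get_varint(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    std::uint64_t value = 0;
    int shift = 0;
    while (pos < in.size() && shift < 64) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            break;  // last byte of this value
        }
        shift += 7;
    }
    return value;
}
```

With this scheme the value 5 takes one byte, while 300 takes two.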

TL;DR: char*

If you're looking for performance, just use a byte array ( char* or char[] ). Use the bulk library functions (memcpy, memmove, etc.) to move and copy the data. It's the bit-by-bit manipulation that makes things slow, so avoid it as much as possible.
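A minimal sketch of that idea, assuming the input arrives as text made of '0' and '1' characters: the bits are packed once into a byte buffer, and after that the data is only moved in bulk. The helper names pack_bits and append_bytes are hypothetical, and the most-significant-bit-first order is an arbitrary choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Pack a string of '0'/'1' characters into bytes, most significant bit first.
// The last byte is zero-padded on the right when the length is not a multiple of 8.
std::vector<std::uint8_t> pack_bits(const std::string& bits) {
    std::vector<std::uint8_t> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i] == '1') {
            bytes[i / 8] |= static_cast<std::uint8_t>(0x80u >> (i % 8));
        }
    }
    return bytes;
}

// Append one packed buffer to another with a single bulk copy.
// `dst` and `src` must be different vectors.
void append_bytes(std::vector<std::uint8_t>& dst, const std::vector<std::uint8_t>& src) {
    if (src.empty()) {
        return;
    }
    const std::size_t old_size = dst.size();
    dst.resize(old_size + src.size());
    std::memcpy(dst.data() + old_size, src.data(), src.size());
}
```

Packed this way, 16 input characters take 2 bytes instead of 16, and later copies go through memcpy rather than touching individual bits.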

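You'll get a big speedup if the data is byte-aligned. You can then do things like reinterpret_cast the pointer type ( int* int_ptr = reinterpret_cast<int*>(char_ptr); int my_int = *int_ptr; ), provided the pointer is suitably aligned for int. Strictly speaking, reading a char buffer through an int* breaks the aliasing rules, and memcpy into an int is the portable equivalent. If the data is not byte-aligned, because space has higher priority, still work on whole words rather than single bits to get a large speedup. Something like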

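unsigned int *int_ptr = reinterpret_cast<unsigned int*>(char_ptr);
// The parentheses are required: + binds tighter than >> and <<.
// Valid for 0 < offset_bits < sizeof(int) * 8; a shift by the full width is undefined.
unsigned int my_int = (*int_ptr >> offset_bits) | (*(int_ptr + 1) << (sizeof(int) * 8 - offset_bits));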
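The same two-word read can be sketched with memcpy, which sidesteps the alignment and aliasing concerns of the pointer cast. This assumes a little-endian host, at least 8 readable bytes at data, and an offset between 0 and 31; the function name read_u32_at_bit is made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Read 32 bits starting `offset_bits` bits into `data` (0 <= offset_bits < 32).
// Assumes a little-endian host and that 8 bytes are readable at `data`.
std::uint32_t read_u32_at_bit(const unsigned char* data, unsigned offset_bits) {
    std::uint32_t lo;
    std::uint32_t hi;
    std::memcpy(&lo, data, sizeof lo);               // first word
    std::memcpy(&hi, data + sizeof lo, sizeof hi);   // following word
    if (offset_bits == 0) {
        return lo;  // avoid shifting `hi` by 32, which is undefined
    }
    return (lo >> offset_bits) | (hi << (32 - offset_bits));
}
```

Optimizing compilers usually turn a fixed-size memcpy like this into a plain load, so the safer form should not cost anything in practice.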

Also compile with optimizations enabled (e.g. -O2) so the compiler can speed the code up for you.

Always make sure you keep track of the size of the data yourself, and don't rely on it being \0-terminated or something like that, since binary data can legitimately contain zero bytes.
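A short sketch of why the length has to travel with the data: anything that infers the size from a terminator stops at the first zero byte. Both helper names here are hypothetical, and std::string is used only as a convenient owning byte buffer.

```cpp
#include <cstddef>
#include <string>

// Copy exactly `size` raw bytes; embedded zero bytes are preserved.
std::string copy_with_length(const char* data, std::size_t size) {
    return std::string(data, size);
}

// Copy up to the first zero byte, as any \0-terminated API would.
// For binary data this silently truncates the payload.
std::string copy_until_nul(const char* data) {
    return std::string(data);
}
```

For the four bytes 01 00 02 00, the first helper keeps all four, while the second keeps only the first.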
