
Boolean bit fields vs logical bit masking or bit shifting - C++

I have a series of classes that are going to require many boolean fields, somewhere between 4-10. I'd like to not have to use a byte for each boolean. I've been looking into bit field structs, something like:

struct BooleanBitFields
{
    bool b1:1;
    bool b2:1;
    bool b3:1;
    bool b4:1;
    bool b5:1;
    bool b6:1;
};

But after doing some research, I see a lot of people saying that this can cause inefficient memory access and may not be worth the memory savings. I'm wondering what the best approach is in this situation. Should I use bit fields, or a char with bit masking (ANDs and ORs) to store 8 bits? If the latter, is it better to bit shift or to use logical operations?

If anyone could comment as to what method they would use and why it would really help me decide which route I should go down.

Thanks in advance!

With the large address spaces on desktop machines, an array of 32/64-bit booleans may seem wasteful, and indeed it is, but most developers don't care (me included). On RAM-restricted embedded controllers, or when accessing hardware in drivers, then sure, use bit fields; otherwise, don't bother.

One other issue, apart from read/write ease and speed, is that a 32- or 64-bit boolean is safer to share between threads than a single bit in the middle of a word, which has to be updated with a multi-step read-modify-write sequence.
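To make the thread-safety point concrete: changing one bit inside a shared byte means reading the byte, modifying it and writing it back, so two threads updating different flags in the same byte can overwrite each other. Here is a minimal sketch, assuming C++11; the names `kReady`, `kDirty`, `shared_flags` and the helper functions are hypothetical and only serve the illustration. With `std::atomic`, the packed version needs an atomic read-modify-write, while a stand-alone flag only needs a store and a load.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag masks, for illustration only.
constexpr std::uint8_t kReady = 1u << 0;
constexpr std::uint8_t kDirty = 1u << 1;

// Packed flags: every update rewrites the whole byte, so it must be an
// atomic read-modify-write operation to be safe across threads.
std::atomic<std::uint8_t> shared_flags{0};

void set_flag(std::uint8_t mask) { shared_flags.fetch_or(mask); }
void clear_flag(std::uint8_t mask) {
    shared_flags.fetch_and(static_cast<std::uint8_t>(~mask));
}
bool test_flag(std::uint8_t mask) { return (shared_flags.load() & mask) != 0; }

// Stand-alone flag: a plain atomic store and load are enough.
std::atomic<bool> ready{false};

// Usage sketch (single-threaded here, just to show the calls).
void atomic_flags_usage() {
    set_flag(kReady);
    set_flag(kDirty);
    clear_flag(kReady);
    assert(!test_flag(kReady) && test_flag(kDirty));
    ready.store(true);
    assert(ready.load());
}
```

Without the atomic wrapper, the packed variant would need a lock around every update, which is part of the cost the answer is hinting at.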

Bit fields are little more than a hint to the compiler: their packing and ordering are implementation-defined, so the compiler is largely free to lay them out as it likes. Some compilers for embedded systems guarantee an exact bit-for-bit mapping. Other compilers don't.
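Since the layout depends on the compiler, the simplest check is to ask the compiler itself. A minimal sketch (the struct names `PackedFlags` and `PlainFlags` are hypothetical): the program prints the sizes instead of assuming them.

```cpp
#include <cassert>
#include <iostream>

struct PackedFlags {   // six 1-bit fields
    bool b1 : 1;
    bool b2 : 1;
    bool b3 : 1;
    bool b4 : 1;
    bool b5 : 1;
    bool b6 : 1;
};

struct PlainFlags {    // six ordinary bools
    bool b1, b2, b3, b4, b5, b6;
};

// Report what this particular compiler chose.
void print_sizes() {
    std::cout << "sizeof(PackedFlags) = " << sizeof(PackedFlags) << '\n'
              << "sizeof(PlainFlags)  = " << sizeof(PlainFlags) << '\n';
}

// Whatever the layout, each field still behaves like an independent bool.
void packed_roundtrip() {
    PackedFlags p{};   // value-initialized: all fields are false
    p.b3 = true;
    assert(p.b3 && !p.b2 && !p.b4);
}
```

Mainstream desktop compilers typically report 1 and 6 bytes here, but that is an observation about those compilers, not something the language promises.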

I would go with a regular struct, like yours but no bit fields. Make them unsigned chars - the shortest data type. The struct will make it easier to access them while editing, if your IDE supports auto completion.

Use an int bit array (leaves you lots of space to expand, and there is no advantage to a single char) and test with mask constants:

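// Parenthesize each mask so it expands safely inside larger expressions
#define BOOL_A (1 << 0)
#define BOOL_B (1 << 1)
#define BOOL_C (1 << 2)
#define BOOL_D (1 << 3)

/* Alternately: use const ints for encapsulation */

// declare and set
int bitray = 0 | BOOL_B | BOOL_D;

// test (needs #include <iostream>)
if (bitray & BOOL_B) std::cout << "Set!\n";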
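The "const ints" alternative mentioned in the comment could look something like the sketch below. It assumes C++11; the namespace `flag` and the helpers `set`, `clear` and `test` are hypothetical names chosen for the example, not part of the original answer. Typed constants avoid the macro precedence pitfalls, and the helpers keep the masking in one place.

```cpp
#include <cassert>

namespace flag {
// Typed mask constants instead of macros.
constexpr unsigned A = 1u << 0;
constexpr unsigned B = 1u << 1;
constexpr unsigned C = 1u << 2;
constexpr unsigned D = 1u << 3;

inline void set(unsigned& bits, unsigned mask)   { bits |= mask; }
inline void clear(unsigned& bits, unsigned mask) { bits &= ~mask; }
inline bool test(unsigned bits, unsigned mask)   { return (bits & mask) != 0; }
}  // namespace flag

void mask_usage() {
    unsigned bits = flag::B | flag::D;   // declare and set
    assert(flag::test(bits, flag::B));   // test
    flag::clear(bits, flag::B);
    assert(!flag::test(bits, flag::B));
    flag::set(bits, flag::A);
    assert(bits == (flag::A | flag::D));
}
```

Using an unsigned type for the storage also sidesteps the sign-related surprises that shifting and inverting signed values can bring.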

I want to write an answer to go over this once more and formalize the question: "What does the transition from working with bytes to working with bits entail?" I also think that the answer "I don't care" is unreasonable.

Exploring char vs bitfield

Agreed, it's very tempting, especially when it's supposed to be used like this:

#define FLAG_1 1
#define FLAG_2 (1 << 1)
#define FLAG_3 (1 << 2)
#define FLAG_4 (1 << 3)

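struct S1 {
    unsigned char flag_1: 1; // unsigned, so each field holds 0 or 1 (not 0 or -1)
    unsigned char flag_2: 1;
    unsigned char flag_3: 1;
    unsigned char flag_4: 1;
}; // sizeof == 1 on typical compilers

void MyFunction(struct S1 *obj, char flags) {
    // Compare against 0: assigning the raw mask result (2, 4 or 8)
    // to a 1-bit field would keep only its lowest bit, which is 0.
    obj->flag_1 = (flags & FLAG_1) != 0;
    obj->flag_2 = (flags & FLAG_2) != 0;
    obj->flag_3 = (flags & FLAG_3) != 0;
    obj->flag_4 = (flags & FLAG_4) != 0;
    // we would like this to be just *obj = flags;
}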

int main(int argc, char **argv)
{
    struct S1 obj;
    MyFunction(&obj, FLAG_1 | FLAG_2 | FLAG_3 | FLAG_4);
    
    return 0;
}

But let's cover all aspects of such an optimization. Let's decompose each operation into simpler C statements that roughly correspond to assembler instructions (in the listings, *obj stands for the byte that stores the bit fields):

  1. Initialization of all flags
    char flags = FLAG_1 | FLAG_3;
    //obj->flag_1 = flags & FLAG_1;
    //obj->flag_2 = flags & FLAG_2;
    //obj->flag_3 = flags & FLAG_3;
    //obj->flag_4 = flags & FLAG_4;
    *obj = flags;
  2. Writing one flag as a constant
    //obj.flag_3 = 1;
    char a = *obj;
    a |= FLAG_3;
    *obj = a;
  3. Writing a single flag from a variable
    char b = 3;
    //obj.flag_3 = b;
    char a = *obj;
    a &= ~FLAG_3;
    
    char c = b;
    c <<= 2;     //Shift to the FLAG_3 position
    c &= FLAG_3; //Discard the extra bits when b > 1
    
    a |= c;
    *obj = a;
  4. Reading one flag into a variable
    //char f = obj.flag_3;
    char f = *obj;
    f >>= 2;
    f &= 0x01;
  5. Writing one flag to another
    //obj.flag_2 = obj.flag_4;
    char a = *obj;
    char b = a & ~FLAG_2; //Copy with the FLAG_2 bit cleared
    a &= FLAG_4;
    a >>= 2; //Shift from the FLAG_4 position to the FLAG_2 position
    b |= a;
    *obj = b;
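The decomposed operations can be checked with ordinary code. This is a minimal sketch under the assumption that the flags sit in the low bits of an unsigned integer, with flag n in bit n-1 as the FLAG_n masks define; the function names are hypothetical.

```cpp
#include <cassert>

// Bit masks; flag n lives in bit n-1 (same layout as the FLAG_n macros).
constexpr unsigned FLAG_1 = 1u << 0;
constexpr unsigned FLAG_2 = 1u << 1;
constexpr unsigned FLAG_3 = 1u << 2;
constexpr unsigned FLAG_4 = 1u << 3;

// 2. obj.flag_3 = 1: set one bit.
unsigned set_flag3(unsigned byte) { return byte | FLAG_3; }

// 3. obj.flag_3 = b: clear the bit, then insert the lowest bit of b.
unsigned write_flag3(unsigned byte, unsigned b) {
    return (byte & ~FLAG_3) | ((b << 2) & FLAG_3);
}

// 4. f = obj.flag_3: shift the bit down and mask it.
unsigned read_flag3(unsigned byte) { return (byte >> 2) & 1u; }

// 5. obj.flag_2 = obj.flag_4: clear bit 1, then move bit 3 down to bit 1.
unsigned copy_flag4_to_flag2(unsigned byte) {
    return (byte & ~FLAG_2) | ((byte & FLAG_4) >> 2);
}

void decomposition_usage() {
    unsigned byte = FLAG_1 | FLAG_3;            // 1. initialization
    assert(read_flag3(byte) == 1u);
    byte = write_flag3(byte, 0u);               // flag_3 cleared
    assert(byte == FLAG_1);
    byte = set_flag3(byte);                     // flag_3 set again
    byte = copy_flag4_to_flag2(byte);           // flag_4 is 0, flag_2 stays 0
    assert(byte == (FLAG_1 | FLAG_3));
    byte = copy_flag4_to_flag2(byte | FLAG_4);  // flag_4 is 1, flag_2 becomes 1
    assert(byte == (FLAG_1 | FLAG_2 | FLAG_3 | FLAG_4));
}
```

Each helper hides two or three primitive operations, which is the same extra work a compiler has to generate for a bit-field access on a machine without single-bit instructions.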

Summary

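| Command                     | Cost, bitfield | Cost, variable |
|-----------------------------|----------------|----------------|
| 1. Init                     | 1              | 4 or less      |
| 2. obj.flag_3 = 1;          | 3              | 1              |
| 3. obj.flag_3 = b;          | 7              | 1 or 3 *       |
| 4. char f = obj.flag_3;     | 2              | 1              |
| 5. obj.flag_2 = obj.flag_4; | 6              | 1              |

* if we guarantee that the flag value is no more than 1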

All operations except initialization take several lines of code. It looks like it would be better to leave bit fields alone after initialization. However, that is rarely how flags are used: they change their state all the time, without warning and at random.

We are essentially trying to make the rare value initialization operation cheaper by sacrificing frequent value change operations.

There are systems in which bit comparison, bit set and reset, bit copying, and even bit swapping and bit branching take one cycle. There are even systems in which mutex locking is implemented by a single assembler instruction. In such systems (PIC microcontrollers, for example), bit fields may be usable in only part of the memory area rather than all of it; in any case, that is not ordinary general-purpose memory.

Perhaps in such systems, the bool type could point to a component of the bitfield.

If your desire to save the insignificant bits of a byte has not yet disappeared, try to think about implementing addressability, atomicity of operations, and arithmetic with bytes, and about the resulting overhead in calls, data memory, code memory, and stack if the algorithms are placed in functions.

Reflections on the choice of bool or char

If your target platform represents the bool type as 2, 4 or more bytes, then bit operations will most likely not be optimized on it. Rather, it is a platform for high-volume computing. This means that bit operations are not in much demand on it, and neither are operations on bytes and words.

In the same way that operations on bits hurt performance, operations on a single byte can also greatly increase the number of cycles to access a variable.

No system can be equally optimal for everything at once. Instead of obsessing over memory savings in systems that are clearly built with a lot of memory surplus, pay attention to the strengths of those systems.

Conclusion

Use char or bool if:

  1. You need to store the mutable state or behavior of the algorithm (and change and return flags individually).
  2. Your flag does not accurately describe the system and could evolve into a number.
  3. You need to be able to access the flag by address.
  4. Your code claims to be platform independent and there is no guarantee that bit operations will be optimized on the target platform.
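The addressability point in the list above is easy to demonstrate. A minimal sketch with hypothetical names (`PlainState`, `PackedState`, `toggle`): a plain bool member can be passed by reference or pointer, while a bit-field member cannot.

```cpp
#include <cassert>

struct PlainState {
    bool visible = false;
    bool enabled = false;
};

struct PackedState {
    bool visible : 1;
    bool enabled : 1;
};

// Works with any bool object, including a member of PlainState.
void toggle(bool& flag) { flag = !flag; }

void address_usage() {
    PlainState plain;
    toggle(plain.enabled);       // a reference to the member is fine
    bool* p = &plain.visible;    // so is a pointer
    *p = true;
    assert(plain.enabled && plain.visible);

    PackedState packed{};        // value-initialized: both fields are false
    // toggle(packed.enabled);   // error: cannot bind a bit-field to bool&
    // bool* q = &packed.visible; // error: cannot take the address of a bit-field
    packed.enabled = !packed.enabled;  // has to be updated in place, by name
    assert(packed.enabled && !packed.visible);
}
```

Any generic helper that takes `bool&` or `bool*` therefore has to be rewritten, or fed a temporary copy, once the flags become bit fields.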

Use bitfields if:

  1. You need to store a huge number of flags without having to constantly read and rewrite them.
  2. You have unusually tight memory requirements, or memory is low.
  3. In other deeply justified cases, with calculations and confirming experiments.
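For the "huge number of flags" case, the standard library already offers a packed container, so the masks do not have to be written by hand. A minimal sketch using `std::bitset`; the constant `kFlagCount` and the variable `visited` are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kFlagCount = 100000;  // hypothetical number of flags

// Packed storage; the class hides the masking and shifting.
std::bitset<kFlagCount> visited;

void bitset_usage() {
    visited.set(42);               // set one flag
    visited.set(99999);
    visited.reset(42);             // clear it again
    assert(!visited.test(42));
    assert(visited.test(99999));
    assert(visited.count() == 1);  // number of flags currently set
}
```

Common implementations store one bit per flag, so this takes roughly an eighth of the memory of an array of bool, at the price of the per-access masking discussed above.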

Perhaps a short rule might be:

Independent flags are stored in a bool.

PS: If you've read this far and still want to save 7 bits out of 8, then consider why nobody feels the urge to use 7-bit bit fields for variables whose value never exceeds 100.

References

Raymond Chen: The cost-benefit analysis of bitfields for a collection of booleans
