Why doesn't C have binary literals?

I frequently wish I could do something like this in C:

val1 &= 0b00001111; //clear high nibble
val2 |= 0b01000000; //set bit 7
val3 &= ~0b00010000; //clear bit 5

Having this syntax seems like an incredibly useful addition to C with no downsides that I can think of, and it seems like a natural thing for a low level language where bit-twiddling is fairly common.

Edit: I'm seeing some other great alternatives, but they all fall apart when the mask is more complex. For example, if reg is a register that controls I/O pins on a microcontroller, and I want to set pins 2, 3, and 7 high at the same time, I could write reg = 0x46; but I had to spend 10 seconds thinking about it (and I'll likely have to spend 10 seconds again every time I read that code after not looking at it for a day or two). Or I could write reg = (1 << 1) | (1 << 2) | (1 << 6); but personally I think that is way less clear than just writing reg = 0b01000110; I can agree that it doesn't scale well beyond 8-bit or maybe 16-bit architectures, though. Not that I've ever needed to make a 32-bit mask.

According to Rationale for International Standard - Programming Languages C §6.4.4.1 Integer constants

A proposal to add binary constants was rejected due to lack of precedent and insufficient utility.

It wasn't in standard C before C23, but GCC supports it as an extension, prefixed by 0b or 0B:

 i = 0b101010;

See here for detail.

This is what pushed hexadecimal to be... hexadecimal. The "... primary use of hexadecimal notation is a human-friendly representation of binary-coded values in computing and digital electronics ... ". It would be as follows:

val1 &= 0x0F;  //clear high nibble
val2 |= 0x40;  //set bit 7
val3 &= ~0x10; //clear bit 5

Hexadecimal:

  1. One hex digit can represent a nibble (4 bits, or half an octet).
  2. Two hex digits can represent a byte (8 bits).
  3. Hex is much more compact when scaling to larger masks.

With some practice, converting between hexadecimal and binary will become much more natural. Try writing out your conversions by hand and not using an online bin/hex notation converter -- then in a couple days it will become natural (and quicker as a result).
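To check conversions done by hand, a small helper that prints a byte one nibble at a time can be handy. This is only a sketch; the function name byte_to_binary and the "hhhh llll" output layout are my own choices, not anything standard.

```c
#include <stdint.h>

/* Hypothetical helper: writes the 8 bits of `value` into `out` as
   "hhhh llll" (high nibble, space, low nibble) followed by a NUL.
   `out` must have room for at least 10 chars. */
void byte_to_binary(uint8_t value, char out[10])
{
    int pos = 0;
    for (int bit = 7; bit >= 0; --bit) {
        out[pos++] = ((value >> bit) & 1u) ? '1' : '0';
        if (bit == 4) {
            out[pos++] = ' ';   /* gap between the two nibbles */
        }
    }
    out[pos] = '\0';
}
```

Each group of four characters corresponds to one hex digit, so passing 0x46 fills the buffer with "0100 0110": 4 is 0100 and 6 is 0110.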

Aside: Binary literals were not part of standard C before C23, but GCC supports them as an extension; they are prefixed with '0b' or '0B'. See the official documentation here for further information. Example:

int b1 = 0b1001; // => 9
int b2 = 0B1001; // => 9

All of your examples can be written even more clearly:

val1 &= (1 << 4) - 1; //clear high nibble
val2 |= (1 << 6); //set bit 6
val3 &= ~(1 << 4); //clear bit 4

(I have taken the liberty of fixing the comments to count from zero, like Nature intended.)

Your compiler will fold these constants, so there is no performance penalty to writing them this way. And these are easier to read than the 0b... versions.
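If you want the compiler to confirm the folding, a C11 static_assert can compare the shift expressions against the masks from the question (0b00001111, 0b01000000, 0b00010000, written here in hex). This is a sketch assuming a C11 or later compiler; the three function names are made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */

/* Each check is evaluated by the compiler; a mismatch is a build error. */
static_assert(((1 << 4) - 1) == 0x0F, "low-nibble mask");
static_assert((1 << 6) == 0x40, "single bit 6, counting from zero");
static_assert((1 << 4) == 0x10, "single bit 4, counting from zero");

/* Hypothetical wrappers around the three operations from the question. */
unsigned clear_high_nibble(unsigned v) { return v & ((1u << 4) - 1u); }
unsigned set_bit6(unsigned v)          { return v | (1u << 6); }
unsigned clear_bit4(unsigned v)        { return v & ~(1u << 4); }
```

Because the operands are integer constant expressions, the checks cost nothing at run time.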

I think readability is the primary concern. Although the code is low-level, it's human beings who read and maintain it, not machines.

Is it easy for you to spot that you mistakenly typed 0b1000000000000000000000000000000 (0x40000000) when you really meant 0b10000000000000000000000000000000 (0x80000000)?

"For example, if reg is a register that controls I/O pins on a microcontroller"

I can't help thinking this is a bad example. Bits in control registers have specific functions (as will any devices connected to individual IO bits).

It would be far more sensible to provide symbolic constants for bit patterns in a header file, rather than working out the binary within the code. Converting binary to hexadecimal or octal is trivial, remembering what happens when you write 01000110 to an IO register is not, particularly if you don't have the datasheet or circuit diagram handy.

You will then not only save those 10 seconds trying to work out the binary code, but maybe the somewhat longer time trying to work out what it does!
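A minimal sketch of what such a header might contain, using the 0x46 pattern from the question. The pin names and their functions are invented for an imaginary board; real ones would come from the datasheet or schematic.

```c
#include <stdint.h>

/* Hypothetical pin assignments; names and positions are assumptions. */
#define PIN_STATUS_LED   (1u << 1)
#define PIN_FAN_ENABLE   (1u << 2)
#define PIN_RELAY_MAIN   (1u << 6)

/* The same bit pattern as 0x46, but it now says what it does. */
#define STARTUP_OUTPUTS  (PIN_STATUS_LED | PIN_FAN_ENABLE | PIN_RELAY_MAIN)

/* Returns the register value with the startup outputs driven high. */
uint8_t apply_startup_outputs(uint8_t reg)
{
    return (uint8_t)(reg | STARTUP_OUTPUTS);
}
```

A reader who sees STARTUP_OUTPUTS does not need to decode either binary or hex to know what the write is for.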

I recommend C macros for this, to avoid compiler warnings or other problems. Instead of 0b and 0x I use Ob and Ox (with the letter O, as in "Ohio").

#include <stdint.h> /* uint32_t, uint64_t */
#include <stdio.h>  /* printf */

#define Ob00000001 1
#define Ob10000000 (1 << (8-1))
#define Ob00001111 15
#define Ob11110000_8 (Ob00001111 << (8 - 4))
#define Ob11110000_16 (Ob00001111 << (16 - 4))
#define Ob11110000_32 (((uint32_t) Ob00001111) << (32 - 4))
#define Ob11110000_64 (((uint64_t) Ob00001111) << (64 - 4))
#define Ox0F Ob00001111
#define OxF0 Ob11110000_8
#define OxF000 Ob11110000_16
#define OxF0000000 Ob11110000_32
#define OxF000000000000000 Ob11110000_64

int main() {
    #define Ob00001110 14
    // bitwise operations work
    if (Ob00001110 == (Ob00001111 & ~Ob00000001)) {
        printf("true\n");
    }
}

My approach was:

/* binmacro.h */

#define BX_0000 0
#define BX_0001 1
#define BX_0010 2
#define BX_0011 3
#define BX_0100 4
#define BX_0101 5
#define BX_0110 6
#define BX_0111 7
#define BX_1000 8
#define BX_1001 9
#define BX_1010 A
#define BX_1011 B
#define BX_1100 C
#define BX_1101 D
#define BX_1110 E
#define BX_1111 F

#define BIN_A(x) BX_ ## x

#define BIN_B(x,y) 0x ## x ## y
#define BIN_C(x,y) BIN_B(x,y)

#define BIN_B4(x,y,z,t) 0x ## x ## y ## z ## t
#define BIN_C4(x,y,z,t) BIN_B4(x,y,z,t)

#define BIN(x,y) BIN_C(BIN_A(x),BIN_A(y))
#define BIN4(x,y,z,t) BIN_C4(BIN_A(x),BIN_A(y),BIN_A(z),BIN_A(t))

/*---- test ... ---*/

BIN(1101,0100)

BIN4(1101,0010,1100,0101)

Which preprocesses to...

$  cpp binmacro.h
0xD4

0xD2C5

Binary is most useful when setting specific outputs on a controller. I use a hack which is technically illegal but nonetheless always works. If you just need to turn an LED on it offends every sensibility to use a whole int, or even a char for the job. Don't forget we're probably not talking about the ultimate in compilation sophistication for these things. So, for individual intelligibility combined with group control I use bitfields :-

struct DEMAND
{
    unsigned int dOil   :   1; // oil on
    unsigned int dAir   :   1; // air on
    unsigned int dHeat  :   1; // heater on
    unsigned int dMtr1  :   1; // motor 1 on
    unsigned int dMtr2  :   1; // motor 2 on
    unsigned int dPad1  :   10;// spare demand o/p's
    unsigned int dRunCycle: 1; // GO !!!!
    unsigned int dPad2  :   15;// spare o/p's
    unsigned int dPowerOn:  1; // Power on
}DemandBF;

They're easily addressed when used singly, or for more thorough control they can be treated as an unsigned int in flagrant disregard of K&R:-

void *bitfPt = &DemandBF;
unsigned int *GroupOuts = (unsigned int *)bitfPt;

DemandBF.dAir = 1;   // Clearly describes what's turning on
DemandBF.dPowerOn = 1;

*GroupOuts ^= 0x04; // toggle the heater

*GroupOuts = 0; // kill it all

It's always worked for me, it's probably not portable, but then who actually ports something like this anyhow? Give it a go.
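A possible variation, sketched under the assumption of a C (not C++) compiler: putting the bit-field struct and the whole word in a union gives the same grouped access without the pointer cast. The union and function names below are made up, only three of the fields are shown, and the bit order inside the word is still implementation-defined.

```c
/* Field names follow the struct above, shortened for the sketch. */
union demand {
    struct {
        unsigned int dOil  : 1;  /* oil on */
        unsigned int dAir  : 1;  /* air on */
        unsigned int dHeat : 1;  /* heater on */
    } bits;
    unsigned int all;            /* whole-word view of the same storage */
};

/* Clears every output in one store. */
void demand_kill_all(union demand *d) { d->all = 0u; }

/* Returns nonzero if any output is currently demanded. */
int demand_any_on(const union demand *d) { return d->all != 0u; }
```

Individual outputs are still set by name (d.bits.dHeat = 1), while d.all covers the group operations.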

The following is limited to 8 bits, although it should be straightforward to extend. While it does not produce a C literal, it does produce a value that an optimizing compiler can compute at compile time.

#define B_(X) B8_("00000000" #X)
#define B8_(X) B8__(X+sizeof(X)-9)
#define B8__(X) \
        (B___((X), 7) | B___((X), 6) | B___((X), 5) | B___((X), 4) | \
         B___((X), 3) | B___((X), 2) | B___((X), 1) | B___((X), 0))
#define B___(X, I) (((X)[7-(I)] - '0') << (I))

The following function is compiled into code that returns the constant 18 .

int test(void) {
    return B_(10010);
}

If performance is not an issue, you can do something simpler:

#define B_(x) strtoull(#x, 0, 2)
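For completeness, here is that macro with the header it needs and a small usage sketch. The function name pins_2_3_7 is hypothetical; it just builds the mask from the question.

```c
#include <stdlib.h>   /* strtoull */

/* Stringify the digits, then parse them as base 2 at run time. */
#define B_(x) strtoull(#x, NULL, 2)

/* Hypothetical example: the pin mask from the question. */
unsigned long long pins_2_3_7(void)
{
    return B_(01000110);   /* same value as 0x46 */
}
```

One caveat: strtoull stops at the first character that is not a valid base-2 digit, so a typo such as B_(10210) silently yields 2 instead of failing to compile.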
