
C: Bit fields and bitwise operators

My professor has assigned us some homework which uses bit fields, and has given us three macros

# define SETBIT(A, k) { A[k >> 3] |= (01 << (k & 07)); }
# define CLRBIT(A, k) { A[k >> 3] &= ~(01 << (k & 07)); }
# define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)

which we're not required/expected to fully understand, only to use to complete the assignment. Each of these macros takes an unsigned int and an index (0-8) and sets/gets/clears a bit at that index. I get this and I get how to use it.

What I want to know is exactly what each of these macros does. Can somebody explain this to me like I'm five?

What the macros do

Ignoring the problems outlined in the next section, the macros treat an array of an integral type as an array of 8-bit values and, when asked to work on bit k , process bit k % 8 of element k / 8 of the array.

However, rather than using k % 8 or k / 8 , they use shifts and masking.

# define SETBIT(A, k) { A[k >> 3] |= (01 << (k & 07)); }
# define CLRBIT(A, k) { A[k >> 3] &= ~(01 << (k & 07)); }
# define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
  • k >> 3 shifts a value right by 3 bit positions, effectively dividing by 8.
  • k & 07 extracts the 3 least significant bits (because 07 octal or 7 decimal is 111 in binary), ignoring the rest.
  • The 01 << (k & 07) shifts the value 1 left by 0..7 bits depending on the value of k & 07 , producing one of the binary values:

     0000 0001
     0000 0010
     0000 0100
     0000 1000
     0001 0000
     0010 0000
     0100 0000
     1000 0000

    Formally, it is actually an int value, and hence probably has 32 bits, but the high order bits are all zeros.

  • The ~ operator converts each 0 bit into a 1 and each 1 bit into a 0.

  • The & operator combines two values, yielding a 1 bit where both bits are 1 and a 0 where either or both bits are 0.
  • The | operator combines two values, yielding a 0 bit where both bits are 0 and a 1 where either or both bits are 1.
  • The assignment operators |= and &= apply the operand on the RHS to the variable on the LHS. The notation a |= b; is equivalent to a = a | b; except that a is evaluated just once. This detail doesn't matter here; it matters intensely if there is an increment or something similar in the expression a .

Putting it all together:

  • SETBIT sets the k th bit (meaning sets it to 1) in the array of 8-bit values represented by A .
  • CLRBIT resets the k th bit (meaning sets it to 0) in the array of 8-bit values represented by A .
  • GETBIT finds the value in the k th bit in the array of 8-bit values represented by A , and returns it as either 0 or 1 — that's what the final >> (k & 07) does.

Nominally, the array elements should be unsigned char to avoid problems with values and wasted space, but any integral type could be used, more or less wastefully. You'd get interesting results if the type is signed char and the high bits are set on the values, or if the type is plain char and plain char is a signed type. You could also get interesting results from GETBIT if the type of A is an integer type bigger than char and the values in the array have bits set outside the last (least significant) 8 bits of the number.

What the macros do not do

The macros provided by the professor are an object lesson in how not to write C preprocessor macros. They do not teach you how to write good C; they teach how to write appallingly awful C.

Each of those macros is dangerously broken because the argument k is not wrapped in parentheses when used. It isn't hard to argue that the same applies to A too. The use of 01 and 07 isn't exactly wrong, but octal 01 and 07 are the same as decimal 1 and 7 , so the leading zeros add nothing.

The GETBIT macro needs an extra level of parentheses around its whole body, too. Given

int y = 2;
unsigned char array[32] = "abcdefghijklmnopqrstuvwxyz01234";

then this does not compile:

int x = GETBIT(array + 3, y + 2) + 13;

This does compile (with warnings) if your compiler options are lax enough, but would produce an eccentric result:

int x = GETBIT(3 + array, y + 2) + 13;

and that's before we try discussing an argument with a side effect, which the macro expands three times:

int x = GETBIT(3 + array, y++) + 13;

The CLRBIT and SETBIT macros use braces which means that you can't write:

if (GETBIT(array, 13))
    SETBIT(array, 27);
else
    CLRBIT(array, 19);

because the semicolon after SETBIT is a null statement after the close brace in the statement block introduced by SETBIT , so the else clause is simply syntactically incorrect.

The macros could be written like this (retaining the statement block structure for the SETBIT and CLRBIT macros):

#define SETBIT(A, k) do { (A)[(k) >> 3] |= (1 << ((k) & 7)); } while (0)
#define CLRBIT(A, k) do { (A)[(k) >> 3] &= ~(1 << ((k) & 7)); } while (0)
#define GETBIT(A, k) (((A)[(k) >> 3] & (1 << ((k) & 7))) >> ((k) & 7))

The do { … } while (0) notation is a standard technique in macros that gets around the problem of breaking if / else statements.

The macros could also be rewritten like this because assignments are expressions:

#define SETBIT(A, k) ( (A)[(k) >> 3] |=  (1 << ((k) & 7)))
#define CLRBIT(A, k) ( (A)[(k) >> 3] &= ~(1 << ((k) & 7)))
#define GETBIT(A, k) (((A)[(k) >> 3] &   (1 << ((k) & 7))) >> ((k) & 7))

Or, even better, as static inline functions like this:

static inline void SETBIT(unsigned char *A, int k) { A[k >> 3] |=  (1 << (k & 7)); }
static inline void CLRBIT(unsigned char *A, int k) { A[k >> 3] &= ~(1 << (k & 7)); }
static inline int  GETBIT(unsigned char *A, int k) { return (A[k >> 3] & (1 << (k & 7))) >> (k & 7); }

The whole can be assembled into a simple test program:

#if MODE == 1

/* As provided */
#define SETBIT(A, k) { A[k >> 3] |= (01 << (k & 07)); }
#define CLRBIT(A, k) { A[k >> 3] &= ~(01 << (k & 07)); }
#define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)

#elif MODE == 2

/* As rewritten */
#define SETBIT(A, k) do { (A)[(k) >> 3] |= (1 << ((k) & 7)); } while (0)
#define CLRBIT(A, k) do { (A)[(k) >> 3] &= ~(1 << ((k) & 7)); } while (0)
#define GETBIT(A, k) (((A)[(k) >> 3] & (1 << ((k) & 7))) >> ((k) & 7))

#else

/* As rewritten using static inline functions */
static inline void SETBIT(unsigned char *A, int k) { A[k >> 3] |=  (1 << (k & 7)); }
static inline void CLRBIT(unsigned char *A, int k) { A[k >> 3] &= ~(1 << (k & 7)); }
static inline int  GETBIT(unsigned char *A, int k) { return (A[k >> 3] & (1 << (k & 7))) >> (k & 7); }

#endif

int main(void)
{
    int y = 2;
    unsigned char array[32] = "abcdefghijklmnopqrstuvwxyz01234";
    int x = GETBIT(array + 3, y + 2) + 13;
    int z = GETBIT(3 + array, y + 2) + 13;

    if (GETBIT(array, 3))
        SETBIT(array, 22);
    else
        CLRBIT(array, 27);

    return x + z;
}

When compiled with -DMODE=2 or -DMODE=0 or without any -DMODE setting, the program compiles cleanly. When compiled with -DMODE=1 , there is an objectionable number of warnings (errors for me because I use GCC and compile with -Werror , which turns every warning into an error).

$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -DMODE=0 bits23.c -o bits23 
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -DMODE=2 bits23.c -o bits23
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -DMODE=1 bits23.c -o bits23
bits23.c: In function ‘main’:
bits23.c:28:33: error: suggest parentheses around ‘+’ inside ‘>>’ [-Werror=parentheses]
     int x = GETBIT(array + 3, y + 2) + 13;
                                 ^
bits23.c:6:25: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                         ^
bits23.c:6:24: error: subscripted value is neither array nor pointer nor vector
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                        ^
bits23.c:28:13: note: in expansion of macro ‘GETBIT’
     int x = GETBIT(array + 3, y + 2) + 13;
             ^
bits23.c:28:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int x = GETBIT(array + 3, y + 2) + 13;
                                 ^
bits23.c:6:43: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                           ^
bits23.c:28:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int x = GETBIT(array + 3, y + 2) + 13;
                                 ^
bits23.c:6:57: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                                         ^
bits23.c:29:33: error: suggest parentheses around ‘+’ inside ‘>>’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                 ^
bits23.c:6:25: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                         ^
bits23.c:29:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                 ^
bits23.c:6:43: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                           ^
bits23.c:29:22: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                      ^
bits23.c:6:23: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                       ^
bits23.c:29:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                 ^
bits23.c:6:57: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                                         ^
bits23.c:29:38: error: suggest parentheses around ‘+’ inside ‘>>’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                      ^
bits23.c:33:5: error: ‘else’ without a previous ‘if’
     else
     ^
cc1: all warnings being treated as errors
$
