简体   繁体   中英

What is the most portable way to read and write the highest bit of an integer in C?

What is the most portable way to read and write the highest bit of an integer in C?

This is a Bloomberg interview question. I didn't give best answer at that time. Can anyone please answer it?

If the type is unsigned, it's easy:

(type)-1-(type)-1/2

For signed values, I know no way. If you find a way, it would answer several unanswered questions on SO:

C question: off_t (and other signed integer types) minimum and maximum values

Is there any way to compute the width of an integer type at compile-time?

Maybe others.

First, note that there's no portable way to access the top bit if we're talking about signed integers; there's simply no single portable representation defined in the standard, so the meaning of 'top bit' can in principle vary. Additionally, C does not allow direct access to the bitwise representation; you can access the int as a char buffer, but you have no idea where the 'top bit' is located.

If we're only concerned with the non-negative range of a signed integer, and assuming said range has a size that is a power of two (if not, then we need to care about the signed representation again):

#define INT_MAX_BIT (INT_MAX - (INT_MAX >> 1))
#define SET_MAX_BIT(x) (x | INT_MAX_BIT)
#define CLEAR_MAX_BIT(x) (x & ~INT_MAX_BIT)

A similar approach can be used with unsigned ints, where it can be used to get the true top bit.

Here's a silly one, using:

Built-in Function: int __builtin_clz (unsigned int x)

Returns the number of leading 0-bits in x, starting at the most
significant bit position. If x is 0, the result is undefined. 

First attempt:

int get_msb(int x) { return x ? __buildin_clz(x) == 0 : 0; }

Note: it's a quirk of C that functions specifying int or unsigned int parameters can be called with the other type without warning. But, this probably involves a conversion - the C++ Standard 4.7.2 says:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type). [Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]

Which implies that the bit pattern may be changed if it's not a two's complement representation, which would stop this "solution" working reliably too. :-(

Chris's comment below provides a solution (incorporated here as a function rather than preprocessor macro):

int get_msb(int x) { return x ? __buildin_clz(*(unsigned*)&x) == 0 : 0; }

What's wrong with this one?

int get_msb(int n){
    return ((unsigned)n) >> (sizeof(unsigned) * CHAR_BIT - 1);
    // or, optionally
    return n < 0;
};

int set_msb(int n, int msb){
    if (msb)
         return ((unsigned)n) |  (1ULL << (sizeof(unsigned) * CHAR_BIT - 1));
    else return ((unsigned)n) & ~(1ULL << (sizeof(unsigned) * CHAR_BIT - 1));
};

It takes care of endianness, number of bits in a byte, and works also on 1's complement.

#define HIGH_BIT(inttype) (((inttype)1) << (CHAR_BIT * sizeof(inttype) - 1))

example usage:

ptrdiff_t i = 4711;
i |=  HIGH_BIT(ptrdiff_t);  /* set high bit */
i &= ~HIGH_BIT(ptrdiff_t);  /* clear high bit */

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM