简体   繁体   中英

Number of trailing zeroes

I've written a function trailing_zeroes(int n) that returns the number of the trailing zeroes in the binary representation of a number.

Example : 4 in binary is 100 , so the function in this case returns 2 .

unsigned trailing_zeroes(int n) {
    unsigned bits;

    bits = 0;
    while (n >= 0 && !(n & 01)) {
        ++bits;
        if (n != 0)
            n >>= 1;
        else
            break;
    }
    return bits;
}

The reason of the if statement is because in case n equals to 0, there will be a loop.

I think it's pretty ugly this code written like this; is there a better way?

I want to avoid the break statement inside the while , because a lot of people told me that using that statement inside while/for sometimes could be "informal". I thought to rewrite the function like this, but I don't think it's the best way to do it:

unsigned bits;
if (n == 0)
    return bits = 1;

bits = 0;
while (!(n & 01)) {
    ++bits;
    n >>= 1;
}

Your function is incorrect: it still has an infinite loop for 0 . The test should be:

while (n > 0 && !(n & 1))

Note that you cannot handle negative numbers with this approach, hence your function should probably take an unsigned number argument, or you could convert the argument to unsigned .

Your function should special case 0 and use a simpler loop:

unsigned trailing_zeroes(int n) {
    unsigned bits = 0, x = n;

    if (x) {
        while ((x & 1) == 0) {
            ++bits;
            x >>= 1;
        }
    }
    return bits;
}

The above function is very simple and easy to understand. It is quite fast if the result is small. The value returned for 0 is 0 as in your function, which is questionnable as 0 really has as many trailing zeroes as value bits in the unsigned type.

There is a more efficient approach with a constant number of steps:

unsigned trailing_zeroes(int n) {
    unsigned bits = 0, x = n;

    if (x) {
        /* assuming `x` has 32 bits: lets count the low order 0 bits in batches */
        /* mask the 16 low order bits, add 16 and shift them out if they are all 0 */
        if (!(x & 0x0000FFFF)) { bits += 16; x >>= 16; }
        /* mask the 8 low order bits, add 8 and shift them out if they are all 0 */
        if (!(x & 0x000000FF)) { bits +=  8; x >>=  8; }
        /* mask the 4 low order bits, add 4 and shift them out if they are all 0 */
        if (!(x & 0x0000000F)) { bits +=  4; x >>=  4; }
        /* mask the 2 low order bits, add 2 and shift them out if they are all 0 */
        if (!(x & 0x00000003)) { bits +=  2; x >>=  2; }
        /* mask the low order bit and add 1 if it is 0 */
        bits += (x & 1) ^ 1;
    }
    return bits;
}

Note that we could handle any larger int size by changing the first step to

while (!(x & 0x0000FFFF)) { bits += 16; x >>= 16; }

Some compilers have a built-in function __builtin_ctz() to count the number of trailing zeroes using very efficient assembly code. It is not a C Standard function but at the cost of reduced portability, you might want to use it if it is available. Check your compiler's documentation.

Here is the abstract from GCC docuemntation :

Built-in Function: int __builtin_ctz (unsigned int x)

Returns the number of trailing 0-bits in x , starting at the least significant bit position. If x is 0 , the result is undefined.

As already mentioned, there is a builtin that can do this, and as it may use hardware, it may be very fast. However, the doc for GCC does say that the result is undefined if the input is 0. Since this is an extension it may not be available for your compiler.

Otherwise, whenever anyone says 'but manipulation' or 'bit counting', you need to reach for your copy of "Hacker's Delight" . A book so good that I bought both editions. There are about 4 pages (1st ed.) dedicated to this, 'ntz' (number of trailing zeroes). If you already have a 'nlz' (number of leading zeroes) or a 'popcnt' function, then you can get the ntz directly. Otherwise the book gives several implementations , some using popcnt, one with a loop, the others using binary search.

For instance,

int ntz3(unsigned x) {
   int n;

   if (x == 0) return(32);
   n = 1;
   if ((x & 0x0000FFFF) == 0) {n = n +16; x = x >>16;}
   if ((x & 0x000000FF) == 0) {n = n + 8; x = x >> 8;}
   if ((x & 0x0000000F) == 0) {n = n + 4; x = x >> 4;}
   if ((x & 0x00000003) == 0) {n = n + 2; x = x >> 2;}
   return n - (x & 1);
}

There are a whole variety of methods for ntz reported in "Hacker's Delight" by Henry Warren.

I thought the De Bruijn sequence solution was pretty wild. See https://en.wikipedia.org/wiki/De_Bruijn_sequence#Finding_least-_or_most-significant_set_bit_in_a_word .

Here is a 64-bit implementation, like one would use in a chess engine to handle a "bitboard".

int ntz(uint64_t x) {
    // We return the number of trailing zeros in
    // the binary representation of x.
    //
    // We have that 0 <= x < 2^64.
    //
    // We begin by applying a function sensitive only
    // to the least significant bit (lsb) of x:
    //
    //   x -> x^(x-1)  e.g. 0b11001000 -> 0b00001111
    //
    // Observe that x^(x-1) == 2^(ntz(x)+1) - 1.

    uint64_t y = x^(x-1);

    // Next, we multiply by 0x03f79d71b4cb0a89,
    // and then roll off the first 58 bits.

    constexpr uint64_t debruijn = 0x03f79d71b4cb0a89;

    uint8_t z = (debruijn*y) >> 58;

    // What? Don't look at me like that.
    //
    // With 58 bits rolled off, only 6 bits remain,
    // so we must have one of 0, 1, 2, ..., 63.
    //
    // It turns out this number was judiciously
    // chosen to make it so each of the possible
    // values for y were mapped into distinct slots.
    //
    // So we just use a look-up table of all 64
    // possible answers, which have been precomputed in 
    // advance by the the sort of people who write
    // chess engines in their spare time:

    constexpr std::array<int,64> lookup = {
         0, 47,  1, 56, 48, 27,  2, 60,
        57, 49, 41, 37, 28, 16,  3, 61,
        54, 58, 35, 52, 50, 42, 21, 44,
        38, 32, 29, 23, 17, 11,  4, 62,
        46, 55, 26, 59, 40, 36, 15, 53,
        34, 51, 20, 43, 31, 22, 10, 45,
        25, 39, 14, 33, 19, 30,  9, 24,
        13, 18,  8, 12,  7,  6,  5, 63
    };

    return lookup[z];
}

C++20 introduced the <bit> header, which provides countr_zero . The linked cppreference page provides the example:

#include <bit>
#include <bitset>
#include <cstdint>
#include <iostream>
 
int main()
{
    for (const std::uint8_t i : { 0, 0b11111111, 0b00011100, 0b00011101 }) {
        std::cout << "countr_zero( " << std::bitset<8>(i) << " ) = "
                  << std::countr_zero(i) << '\n';
    }
}

Output:

countr_zero( 00000000 ) = 8
countr_zero( 11111111 ) = 0
countr_zero( 00011100 ) = 2
countr_zero( 00011101 ) = 0

If you need a value of 0 for a 0 input, you can do something like:

(n ? countr_zero(n) : 0)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM