
Efficient algorithm to calculate the sum of number of base2 digits (number of bits) over an interval of positive integers

Let's say I've been given two integers a and b, where a is a non-negative integer and is smaller than b. I have to find an efficient algorithm that gives me the sum of the number of base-2 digits (number of bits) over the interval [a, b]. For example, in the interval [0, 4] the sum of digits is equal to 9, because 0 = 1 digit, 1 = 1 digit, 2 = 2 digits, 3 = 2 digits and 4 = 3 digits.

My program is capable of calculating this number by using a loop but I'm looking for something more efficient for large numbers. Here are the snippets of my code just to give you an idea:

int numberOfBits(int i) {
    if(i == 0) {
        return 1;
    }
    else {
        return (int) log2(i) + 1;
    }
}

The function above is for calculating the number of digits of one number in the interval.

The code below shows you how I use it in my main function.

for(i = a; i <= b; i++) {
    l = l + numberOfBits(i);
}
printf("Digits: %d\n", l);

Ideally I should be able to get the number of digits by using the two values of my interval and using some special algorithm to do that.

Try this code; I think it gives you what you need to count the binary digits:

int bit(int x)
{
  if(!x) return 1;
  else
  {
    int i;
    for(i = 0; x; i++, x >>= 1);
    return i;
  }
}

First, we can improve the speed of log2, but that only gives us a fixed factor speed-up and doesn't change the scaling.

Faster log2 adapted from: https://graphics.stanford.edu/~seander/bithacks.html#IntegerLogLookup

The lookup table method takes only about 7 operations to find the log of a 32-bit value. If extended for 64-bit quantities, it would take roughly 9 operations. Another operation can be trimmed off by using four tables, with the possible additions incorporated into each. Using int table elements may be faster, depending on your architecture.
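As an aside to the table-based log2 quoted above, here is a minimal sketch of the same bit-length computation using a compiler intrinsic. It assumes GCC or Clang: `__builtin_clz` is a compiler extension, not standard C, and its result is undefined for 0, so 0 is handled separately. The name `bit_length_clz` is hypothetical and does not appear in the original answers.

```c
#include <limits.h>

/* Bit length via a compiler intrinsic (assumption: GCC or Clang).
   __builtin_clz counts leading zero bits and is undefined for 0,
   so 0 is special-cased and treated as one digit, as in the question. */
unsigned int bit_length_clz(unsigned int v)
{
    if (v == 0)
        return 1;
    return (unsigned int)(sizeof v * CHAR_BIT) - (unsigned int)__builtin_clz(v);
}
```

Like the lookup table, this only speeds up each call by a constant factor; it does not change how the loop over the interval scales.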

Second, we must re-think the algorithm. If you know that numbers between N and M have the same number of digits, would you add them up one by one or would you rather do (M-N+1)*numDigits?
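The multiplication idea can be sketched as a one-line helper. This is only an illustration: the name `same_length_sum` is hypothetical, and the function assumes the caller already knows that every value in [n, m] has the same bit length `num_digits`.

```c
/* Sum of binary digits over [n, m], assuming (not checking) that every
   value in the range has the same bit length num_digits.
   A 64-bit accumulator avoids overflow for wide ranges. */
unsigned long long same_length_sum(unsigned int n, unsigned int m,
                                   unsigned int num_digits)
{
    return ((unsigned long long)m - n + 1) * num_digits;
}
```

For instance, the values 8 to 14 all have 4 digits, so `same_length_sum(8, 14, 4)` replaces seven calls to numberOfBits with one multiplication.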

But what do we do if the range contains numbers with different digit counts? Let's just find the intervals of equal digit count and add up the sums of those intervals. Implemented below. I think that my findEndLimit could be further optimized with a lookup table.

Code

#include <stdio.h>
#include <limits.h>
#include <time.h>

unsigned int fastLog2(unsigned int v)
{
    static const char LogTable256[256] = 
    {
    #define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
        -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
        LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
        LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
    };

    register unsigned int t, tt; // temporaries

    if (tt = v >> 16)
    {
      return (t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt];
    }
    else 
    {
      return (t = v >> 8) ? 8 + LogTable256[t] : LogTable256[v];
    }
}

unsigned int numberOfBits(unsigned int i)
{
    if (i == 0) {
        return 1;
    }
    else {
        return fastLog2(i) + 1;
    }
}

unsigned int findEndLimit(unsigned int sx, unsigned int ex)
{
    unsigned int sy = numberOfBits(sx);
    unsigned int ey = numberOfBits(ex);
    unsigned int mx;
    unsigned int my;

    if (sy == ey) // sx and ex have the same bit count: just return ex
        return ex;

    // assumes sy < ey
    mx = (ex - sx) / 2 + sx; // will eq. sx for sx + 1 == ex
    my = numberOfBits(mx);
    while (ex - sx != 1) {
        mx = (ex - sx) / 2 + sx; // will eq. sx for sx + 1 == ex
        my = numberOfBits(mx);
        if (my == ey) {
            ex = mx;
            ey = numberOfBits(ex);
        }
        else {
            sx = mx;
            sy = numberOfBits(sx);
        }
    }
    return sx+1;
}

int main(void)
{
    unsigned int a, b, m;
    unsigned long l;
    clock_t start, end;
    l = 0;
    a = 0;
    b = UINT_MAX;

    start = clock();
    unsigned int i;
    for (i = a; i < b; ++i) {
        l += numberOfBits(i);
    }
    if (i == b) {
        l += numberOfBits(i);
    }
    end = clock();

    printf("Naive\n");
    printf("Digits: %lu; Time: %fs\n",l, ((double)(end-start))/CLOCKS_PER_SEC);

    l=0;
    start = clock();
    do {
        m = findEndLimit(a, b);
        l += (b-m + 1) * (unsigned long)numberOfBits(b);
        b = m-1;
    } while (b > a);
    l += (b-a+1) * (unsigned long)numberOfBits(b);
    end = clock();

    printf("Binary search\n");
    printf("Digits: %lu; Time: %fs\n",l, ((double)(end-start))/CLOCKS_PER_SEC);
}

Output

From 0 to UINT_MAX

$ ./main 
Naive
Digits: 133143986178; Time: 25.722492s
Binary search
Digits: 133143986178; Time: 0.000025s

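My findEndLimit can take a long time in some edge cases, for example when a and b have the same number of bits, because the main loop then advances one number at a time: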

From UINT_MAX/16+1 to UINT_MAX/8

$ ./main 
Naive
Digits: 7784628224; Time: 1.651067s
Binary search
Digits: 7784628224; Time: 4.921520s
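One possible way around this edge case is to skip the binary search and compute the start of b's bit-length interval directly with a shift. The sketch below is an assumption about how that could look, not the answer's own code; the names `bit_length_shift`, `first_with_same_length` and `digit_sum_range` are hypothetical.

```c
/* Bit length by shifting; 0 is treated as one digit. */
static unsigned int bit_length_shift(unsigned int v)
{
    unsigned int n = 0;
    do {
        ++n;
        v >>= 1;
    } while (v != 0);
    return n;
}

/* Smallest value in [a, b] that has the same bit length as b. */
static unsigned int first_with_same_length(unsigned int a, unsigned int b)
{
    unsigned int len = bit_length_shift(b);
    /* values with one digit start at 0, others at 2^(len-1) */
    unsigned int low = (len == 1) ? 0u : 1u << (len - 1);
    return a > low ? a : low;
}

/* Sum of bit lengths over [a, b], assuming a <= b. */
unsigned long long digit_sum_range(unsigned int a, unsigned int b)
{
    unsigned long long sum = 0;
    for (;;) {
        unsigned int m = first_with_same_length(a, b);
        sum += ((unsigned long long)b - m + 1) * bit_length_shift(b);
        if (m == a)
            break;      /* reached the lower end of the range */
        b = m - 1;      /* m > a >= 0 here, so this cannot wrap */
    }
    return sum;
}
```

With this variant the loop runs at most once per distinct bit length, so the range from UINT_MAX/16+1 to UINT_MAX/8 is handled in a single iteration.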

Algorithm

The main idea is to find n2 = log2(x) rounded down, so that x has n2 + 1 digits. Let pow2 = 1 << n2. Then (n2 + 1) * (x - pow2 + 1) is the number of digits in the values [pow2...x]. Now add the sum of digits for the lower ranges, each of which starts at a power of 2: n2 * 2^(n2-1) + ... + 2*2 + 1*1, plus one digit for the value 0.
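For reference, the sum over the full power-of-two ranges has a closed form. The following is a sketch under the convention that 0 counts as one digit; the symbols S, d and n are introduced here for illustration and are not used in the original answer.

```latex
% d(i): number of binary digits of i, with d(0) = 1
% n = \lfloor \log_2 x \rfloor for x \ge 1
\begin{aligned}
S(x) &= \sum_{i=0}^{x} d(i)
      = 1 + \sum_{k=1}^{n} k \, 2^{k-1} + (n+1)\,(x - 2^{n} + 1), \\
\sum_{k=1}^{n} k \, 2^{k-1} &= (n-1)\,2^{n} + 1, \\
\sum_{i=a}^{b} d(i) &= S(b) - S(a-1) \qquad (1 \le a \le b).
\end{aligned}
```

As a check, x = 4 gives n = 2 and S(4) = 1 + 5 + 3 = 9, matching the example in the question.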

Code

I am certain various simplifications can be made.
Untested code. Will review later.

// Let us use unsigned for everything.

unsigned ulog2(unsigned value) {
  unsigned result = 0;
  if (0xFFFF0000u & value) {
    value >>= 16; result += 16;
  }
  if (0xFF00u & value) {
    value >>= 8; result += 8;
  }
  if (0xF0u & value) {
    value >>= 4; result += 4;
  }
  if (0xCu & value) {
    value >>= 2; result += 2;
  }
  if (0x2 & value) {
    value >>= 1; result += 1;
  }
  return result;
}

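#include <assert.h>

// Total number of binary digits used by all values in [0, x].
unsigned bit_count_helper(unsigned x) {
  if (x == 0) {
    return 1;  // 0 is counted as one digit
  }
  unsigned n2 = ulog2(x);
  unsigned pow2 = 1u << n2;
  unsigned sum = (n2 + 1u) * (x - pow2 + 1u);  // values pow2..x have n2+1 digits each
  while (n2 > 0) {
    // ... + 5*16 + 4*8 + 3*4 + 2*2 + 1*1
    pow2 /= 2;
    sum += n2 * pow2;  // values pow2..2*pow2-1 have n2 digits each
    n2--;
  }
  return sum + 1u;  // plus one digit for the value 0
}

// Sum over the inclusive interval [a, b]. Note: unsigned can overflow for large b.
unsigned bit_count(unsigned a, unsigned b) {
  assert(a <= b);
  return bit_count_helper(b) - (a > 0 ? bit_count_helper(a - 1) : 0u);
}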

Conceptually, you would split the task into two subproblems:

1) find the sum of digits from 0..M and from 0..N, then subtract.

2) find floor(log2(x)), because e.g. for the number 77 the numbers 64, 65, ..., 77 all have 7 digits, the 32 numbers before them (32..63) have 6 digits, the 16 before those have 5 digits and so on, so the block sizes form a geometric progression.

Thus:

 int digits(int a) {
   if (a == 0) return 1;   // should digits(0) be 0 or 1 ?
   int b=(int)floor(log2(a));   // use any all-integer calculation hack
   int sum = 1 + (b+1) * (a- (1<<b) +1);  // added 1, due to digits(0)==1
   while (--b >= 0)   // include b == 0 (the value 1) and stop before b goes negative
     sum += (b + 1) << b;   // shortcut for (b + 1) * (1 << b);
   return sum;
 }
 int digits_range(int a, int b) {
      if (a <= 0 || b <= 0) return -1;   // formulas work for strictly positive numbers
      return digits(b)-digits(a-1);
 }
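The loop over the lower blocks can also be replaced by a closed form, since 1*1 + 2*2 + 3*4 + ... + n*2^(n-1) = (n-1)*2^n + 1. Below is a minimal sketch under the assumptions that inputs fit in 32 bits and that 0 counts as one digit; the names `floor_log2_u32`, `digits_upto` and `digits_between` are hypothetical and 64-bit results are used to avoid the overflow that int would hit.

```c
#include <stdint.h>

/* floor(log2(v)) for v >= 1, integer-only */
static unsigned int floor_log2_u32(uint32_t v)
{
    unsigned int n = 0;
    while (v >>= 1)
        ++n;
    return n;
}

/* Total number of binary digits of all values in [0, x]. */
uint64_t digits_upto(uint32_t x)
{
    if (x == 0)
        return 1;                       /* only the value 0 */
    unsigned int n = floor_log2_u32(x);
    uint64_t pow2 = (uint64_t)1 << n;
    /* digits of 1 .. 2^n - 1: (n-1)*2^n + 1, written to avoid wraparound */
    uint64_t full_blocks = (uint64_t)n * pow2 - (pow2 - 1);
    /* digits of 2^n .. x, each with n+1 digits */
    uint64_t tail = (uint64_t)(n + 1) * (x - pow2 + 1);
    return 1 + full_blocks + tail;      /* leading 1 is the digit of 0 */
}

/* Sum over the inclusive range [a, b], assuming a <= b. */
uint64_t digits_between(uint32_t a, uint32_t b)
{
    return digits_upto(b) - (a > 0 ? digits_upto(a - 1) : 0);
}
```

Apart from the integer log2, this takes a constant number of operations regardless of the size of the interval.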

As efficiency depends on the tools available, one approach would be doing it "analog":

#include <stdlib.h>
#include <stdio.h>
#include <math.h> 

unsigned long long pow2sum_min(unsigned long long n, long long unsigned m)
{
  if (m >= n)
  {
    return 1;
  }

  --n;

  return (2ULL << n) + pow2sum_min(n, m);
}

#define LN(x) (log2(x)/log2(M_E))

int main(int argc, char** argv)
{
  if (2 >= argc)
  {
    fprintf(stderr, "%s a b\n", argv[0]);
    exit(EXIT_FAILURE);
  }

  long a = atol(argv[1]), b = atol(argv[2]);

  if (0L >= a || 0L >= b || b < a)
  {
    puts("Na ...!");
    exit(EXIT_FAILURE);
  }

  /* Expand interval to cover full dimensions: */
  unsigned long long a_c = pow(2, floor(log2(a)));
  unsigned long long b_c = pow(2, floor(log2(b+1)) + 1);

  double log2_a_c = log2(a_c);
  double log2_b_c = log2(b_c);

  unsigned long p2s = pow2sum_min(log2_b_c, log2_a_c) - 1;

  /* Integral log2(x) between a_c and b_c: */
  double A = ((b_c * (LN(b_c) - 1)) 
            - (a_c * (LN(a_c) - 1)))/LN(2)
            + (b+1 - a);

  /* "Integer"-integral - integral of log2(x)'s inverse function (2**x) between log(a_c) and log(b_c): */
  double D = p2s - (b_c - a_c)/LN(2);

  /* Corrective from a_c/b_c to a/b : */
  double C = (log2_b_c - 1)*(b_c - (b+1)) + log2_a_c*(a - a_c);

  printf("Total used digits: %lld\n", (long long) ((A - D - C) +.5));
}

:-)

What matters here is the number and kind of iterations performed.

The number of iterations is

log2(b_c) - log2(a_c)

and each iteration performs

n - 1 /* Integer decrement  */
2**n + s /* One bit-shift and one integer addition  */

The main thing to understand here is that the number of digits used to represent a number in binary increases by one with each power of two:

+--------------+---------------+
| number range | binary digits |
+==============+===============+
| 0 - 1        | 1             |
+--------------+---------------+
| 2 - 3        | 2             |
+--------------+---------------+
| 4 - 7        | 3             |
+--------------+---------------+
| 8 - 15       | 4             |
+--------------+---------------+
| 16 - 31      | 5             |
+--------------+---------------+
| 32 - 63      | 6             |
+--------------+---------------+
| ...          | ...           |
+--------------+---------------+

A simple improvement over your brute-force algorithm is then to work out which digit-count intervals lie between the two numbers passed in (given by the base-two logarithm) and, for each interval, to add the count of numbers in it (given by a power of two) multiplied by its number of digits.

A naive implementation of this algorithm is:

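#include <math.h>

int digits_sum_seq(int a, int b)
{
    int sum = 0;
    int i = 0;
    int log2b = b <= 0 ? 0 : floor(log2(b));     /* digits of b, minus one */
    int log2a = a <= 0 ? 1 : floor(log2(a)) + 1; /* digits of a */

    /* a and b have the same number of digits: one product is enough */
    if (log2a > log2b)
        return (b - a + 1) * log2a;

    /* from a to the end of its interval */
    sum += (pow(2, log2a) - a) * (log2a);

    /* full intervals in between */
    for (i = log2b; i > log2a; i--)
        sum += pow(2, i - 1) * i;

    /* from the start of b's interval to b */
    sum += (b - pow(2, log2b) + 1) * (log2b + 1);

    return sum;
}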

It can then be improved by the more efficient versions of the log and pow functions seen in the other answers.

Here's an entirely look-up based approach. You don't even need the log2 :)

Algorithm

First we precompute the interval limits at which the number of bits changes and store them in a lookup table. In other words, we create an array limits[n], where limits[i] is the biggest integer that can be represented with (i+1) bits. Our array is then {1, 3, 7, ..., 2^n-1}.

Then, to determine the sum of bits for our range, we first find the smallest indices i and j for which a <= limits[i] and b <= limits[j] hold. This tells us that we need (i+1) bits to represent a and (j+1) bits to represent b.

If the indices are the same, the result is simply (b-a+1)*(i+1). Otherwise we separately count the bits from a to the end of its interval and from the start of b's interval to b, and add the total number of bits for each full interval in between. In any case, simple arithmetic.

Code

#include <stdio.h>
#include <limits.h>
#include <time.h>

unsigned long bitsnumsum(unsigned int a, unsigned int b)
{
    // generate lookup table
    // limits[i] is the max. number we can represent with (i+1) bits
    static const unsigned int limits[32] =
    {
    #define LTN(n) n*2u-1, n*4u-1, n*8u-1, n*16u-1, n*32u-1, n*64u-1, n*128u-1, n*256u-1
        LTN(1),
        LTN(256),
        LTN(256*256),
        LTN(256*256*256)
    };

    // make it work for any order of arguments
    if (b < a) {
        unsigned int c = a;
        a = b;
        b = c;
    }

    // find interval of a
    unsigned int i = 0;
    while (a > limits[i]) {
            ++i;
    }
    // find interval of b
    unsigned int j = i;
    while (b > limits[j]) {
            ++j;
    }

    // add it all up
    unsigned long sum = 0;
    if (i == j) {
        // a and b in the same range
        // conveniently, this also deals with j == 0
        // so no danger to do [j-1] below
        return (i+1) * (unsigned long)(b - a + 1);
    }
    else {
        // add sum of digits in range [a, limits[i]]
        sum += (i+1) * (unsigned long)(limits[i] - a + 1);
        // add sum of digits in range [limits[j-1]+1, b]
        sum += (j+1) * (unsigned long)(b - limits[j-1]);
        // add sum of digits for the full intervals strictly between i and j
        for (++i; i<j; ++i) {
            sum += (i+1) * (unsigned long)(limits[i] - limits[i-1]);
        }
        return sum;
    }
}

int main(void)
{
    clock_t start, end;
    unsigned int a=0, b=UINT_MAX;

    start = clock();
    printf("Sum of binary digits for numbers in range "
    "[%u, %u]: %lu\n", a, b, bitsnumsum(a, b));
    end = clock();
    printf("Time: %fs\n", ((double)(end-start))/CLOCKS_PER_SEC);
}

Output

$ ./lookup 
Sum of binary digits for numbers in range [0, 4294967295]: 133143986178
Time: 0.000282s

For this problem your solution is the simplest one, the so-called "naive" approach, where you visit every element of the sequence (here, the interval) to check something or perform an operation on it.

Naive Algorithm

Assume that a and b are positive integers with b greater than a, and call the size of the interval [a,b] n = (b - a + 1).

With n elements, and using big-O notation, the worst-case cost is O(n*(numberOfBits_cost)).

From this we can see that we can speed up our algorithm either by using a faster numberOfBits(), or by finding a way to avoid looking at every element of the interval, which is what costs us n operations.

Intuition

Now looking at a possible interval [6,14], you can see that 6 and 7 need 3 digits, while 8,9,10,11,12,13,14 need 4. The naive loop calls numberOfBits() for every number, even though many of them share the same number of digits, while the following multiplication would be faster:

(number_in_subinterval)*digitsForThisInterval
((14-8)+1)*4 = 28
((7-6)+1)*3 = 6

So we reduced a loop over 9 elements, with 9 operations, to only 2.

Writing a function based on this intuition gives us an algorithm that is more efficient in time, though not necessarily in memory. Using your numberOfBits() function I have created this solution:

int intuitionSol(int a, int b) {
    int digitsForA = numberOfBits(a);
    int digitsForB = numberOfBits(b);

    if (digitsForA != digitsForB) {
        // a and b need not be the first or last element of the interval
        // that a given number of digits can represent, so the partial
        // intervals at both ends are handled separately
        int tmp = pow(2, digitsForA) - a;
        int result = tmp * digitsForA; // will contain the final result that is returned

        int i;
        for (i = digitsForA + 1; i < digitsForB; i++) {
            int interval_elements = pow(2, i) - pow(2, i - 1);
            result = result + (interval_elements * i);
            //printf("NumOfElem: %i for %i digits; sum:= %i\n", interval_elements, i, result);
        }

        int tmp1 = (b + 1) - pow(2, digitsForB - 1);
        result = result + tmp1 * digitsForB;
        return result;
    }
    else {
        int elements = (b - a) + 1;
        return elements * digitsForA; // or digitsForB
    }
}

Let's look at the cost. The cost of this algorithm is the cost of the correction operations on a and b plus the most expensive part, the for-loop. In my solution, however, I'm not looping over all elements but only over numberOfBits(b)-numberOfBits(a) intervals, which in the worst case, [0,n], becomes log(n)-1, i.e. O(log n). To sum up, we went from a linear cost O(n) to a logarithmic one O(log n) in the worst case.

Note

When I talk about an interval or sub-interval, I mean the set of consecutive numbers that use the same number of binary digits. Below is some output from my tests; the last one shows the difference:

Considered interval is [0,4]
YourSol: 9 in time: 0.000015s
IntuitionSol: 9 in time: 0.000007s

Considered interval is [0,0]
YourSol: 1 in time: 0.000005s
IntuitionSol: 1 in time: 0.000005s

Considered interval is [4,7]
YourSol: 12 in time: 0.000016s
IntuitionSol: 12 in time: 0.000005s

Considered interval is [2,123456]
YourSol: 1967697 in time: 0.005010s
IntuitionSol: 1967697 in time: 0.000015s
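Comparisons like the ones above can be automated with a small brute-force reference. The following is a sketch only: the names `range_sum_fn`, `bit_length_ref`, `brute_force_sum` and `matches_brute_force` are hypothetical, and it assumes the candidate function takes and returns the types shown.

```c
/* Signature assumed for the function under test. */
typedef unsigned long long (*range_sum_fn)(unsigned int, unsigned int);

/* Reference bit length; 0 is treated as one digit. */
static unsigned int bit_length_ref(unsigned int v)
{
    unsigned int n = 1;
    while (v >>= 1)
        ++n;
    return n;
}

/* Brute-force sum over [a, b], assuming a <= b.
   The explicit break avoids wraparound when b == UINT_MAX. */
unsigned long long brute_force_sum(unsigned int a, unsigned int b)
{
    unsigned long long sum = 0;
    unsigned int i = a;
    for (;;) {
        sum += bit_length_ref(i);
        if (i == b)
            break;
        ++i;
    }
    return sum;
}

/* Returns 1 if candidate agrees with the brute force on every range
   [a, b] with 0 <= a <= b <= limit, else 0.
   Assumes limit < UINT_MAX so the loop counters cannot wrap. */
int matches_brute_force(range_sum_fn candidate, unsigned int limit)
{
    for (unsigned int a = 0; a <= limit; ++a)
        for (unsigned int b = a; b <= limit; ++b)
            if (candidate(a, b) != brute_force_sum(a, b))
                return 0;
    return 1;
}
```

Checking all small ranges this way exercises the boundary cases (a or b equal to 0, or to a power of two) where the interval-based solutions are most likely to be off by one.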
