
Total number of ways to write a positive integer as the sum of powers of 2 in efficient time

I've been looking at "Number of ways to write n as a sum of powers of 2" and it works just fine, but I was wondering how to improve the run-time efficiency of that algorithm. It fails to compute anything above ~1000 in a reasonable amount of time (under 10 seconds).

I'm assuming it has something to do with breaking it down into subproblems, but I don't know how to go about it. I was thinking of something like O(n) or O(n log n) runtime, which I'm sure is possible somehow. I just don't know how to split up the work efficiently.

code via Chasefornone

#include <iostream>
using namespace std;

int log2(int n)
{
    int ret = 0;
    while (n>>=1) 
    {
        ++ret;      
    }
    return ret;
}

int power(int x,int y)
{
    int ret=1,i=0;
    while(i<y)
    {
        ret*=x;
        i++;
    }
    return ret;
}

int getcount(int m,int k)
{
    if(m==0)return 1;
    if(k<0)return 0;
    if(k==0)return 1;
    if(m>=power(2,k))return getcount(m-power(2,k),k)+getcount(m,k-1);
    else return getcount(m,k-1);

}

int main()
{
    int m=0;
    while(cin>>m)
    {
        int k=log2(m);
        cout<<getcount(m,k)<<endl;
    }
    return 0;
}

Since we're dealing with powers of some base (in this case 2), we can easily do it in O(n) time (and space, if we consider the counts of fixed size).

The key is the generating function of the partitions. Let p(n) be the number of ways to write n as a sum of powers of the base b.

Then consider

        ∞
f(X) =  ∑  p(n)*X^n
       n=0

One can write f as an infinite product,

        ∞
f(X) =  ∏  1/(1 - X^(b^k))
       k=0

and if one only wants the coefficients up to some limit l, one need only consider the factors with b^k <= l.

Multiplying them in the correct order (descending), at the step for the factor with exponent i one knows that only coefficients whose index is divisible by b^i are nonzero. With k the largest exponent, one therefore needs only n/b^k + n/b^(k-1) + ... + n/b + n additions of the coefficients. That is a geometric sum bounded by n*b/(b-1), so the total is O(n).

Code (not guarding against overflow for larger arguments):

#include <stdio.h>

unsigned long long partitionCount(unsigned n);

int main(void) {
    unsigned m;
    while(scanf("%u", &m) == 1) {
        printf("%llu\n", partitionCount(m));
    }
    return 0;
}

unsigned long long partitionCount(unsigned n) {
    if (n < 2) return 1;
    unsigned h = n /2, k = 1;
    // find largest power of two not exceeding n
    while(k <= h) k <<= 1;
    // coefficient array
    unsigned long long arr[n+1];
    arr[0] = 1;
    for(unsigned i = 1; i <= n; ++i) {
        arr[i] = 0;
    }
    while(k) {
        for(unsigned i = k; i <= n; i += k) {
            arr[i] += arr[i-k];
        }
        k /= 2;
    }
    return arr[n];
}

This runs fast enough:

$ echo "1000 end" | time ./a.out
1981471878
0.00user 0.00system 0:00.00elapsed
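The explanation above is stated for a general base b, while the code only handles b = 2. As a minimal sketch of the same descending-order multiplication for an arbitrary base, something like the following should work. The name `partitionCountBase` and its signature are hypothetical (not from the original answer), and like the original it does not guard against overflow of the 64-bit counts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the original answer): number of ways to
// write n as a sum of powers of `base`. Counts are 64-bit and are NOT guarded
// against overflow for large n.
std::uint64_t partitionCountBase(unsigned n, unsigned base) {
    assert(base >= 2);

    // Collect the powers base^0, base^1, ... that do not exceed n.
    std::vector<unsigned> powers;
    for (std::uint64_t p = 1; p <= n; p *= base) {
        powers.push_back(static_cast<unsigned>(p));
    }

    // Coefficient array of the truncated generating function.
    std::vector<std::uint64_t> coeff(static_cast<std::size_t>(n) + 1, 0);
    coeff[0] = 1;

    // Multiply by 1/(1 - X^step), largest power first. At each step only
    // indices divisible by `step` can be nonzero, so only those are touched.
    for (auto it = powers.rbegin(); it != powers.rend(); ++it) {
        const std::uint64_t step = *it;
        for (std::uint64_t i = step; i <= n; i += step) {
            coeff[i] += coeff[i - step];
        }
    }
    return coeff[n];
}
```

Calling `partitionCountBase(1000, 2)` should reproduce the value printed above, and `partitionCountBase(9, 3)` counts the five ways to write 9 with the parts 1, 3 and 9.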

A generally applicable approach to problems like this is to cache the intermediate results, e.g. as follows:

#include <iostream>
#include <map>

using namespace std;

map<pair<int,int>,int> cache;

/* 
The log2() and power() functions remain unchanged and so are omitted for brevity
 */
int getcount(int m,int k)
{
    map<pair<int,int>, int>::const_iterator it = cache.find(make_pair(m,k));
    if (it != cache.end()) {
        return it->second;
    }
    int count = -1;
    if(m==0) {
       count = 1;
    } else if (k<0) {
        count = 0;
    } else if (k==0) {
       count = 1;
    } else if(m>=power(2,k)) {
        count = getcount(m-power(2,k),k)+getcount(m,k-1);
    } else {
        count = getcount(m,k-1);
    }
    cache[make_pair(m,k)] = count;
    return count;
}

/* 
The main() function remains unchanged and so is omitted for brevity
 */
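Since the listing above omits the helper functions, here is a self-contained sketch of the same caching idea. It is only an illustration under some assumptions: the names `countWaysMemo` and `countWays` are hypothetical, `1u << k` stands in for `power(2,k)`, and the counts are widened to 64 bits. The recursion depth still grows with m, so very large inputs may exhaust the stack.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical names; a sketch of the caching approach, not the original code.
// Cache keyed by (m, k): ways to write m using powers of two up to 2^k.
static std::map<std::pair<unsigned, int>, std::uint64_t> memo;

std::uint64_t countWaysMemo(unsigned m, int k) {
    // Same base cases, in the same order, as the original getcount().
    if (m == 0) return 1;
    if (k < 0) return 0;
    if (k == 0) return 1;

    const std::pair<unsigned, int> key(m, k);
    const auto it = memo.find(key);
    if (it != memo.end()) {
        return it->second;  // already computed
    }

    const unsigned p = 1u << k;  // 2^k, replacing power(2, k)
    std::uint64_t count = countWaysMemo(m, k - 1);  // do not use 2^k
    if (m >= p) {
        count += countWaysMemo(m - p, k);  // use 2^k at least once
    }
    memo[key] = count;
    return count;
}

// Convenience wrapper: start from the largest power of two not exceeding m.
std::uint64_t countWays(unsigned m) {
    int k = 0;
    for (unsigned t = m; t >>= 1;) {
        ++k;  // k ends up as floor(log2(m)) for m > 0
    }
    return countWaysMemo(m, k);
}
```

With this sketch, `countWays(1000)` should give the same value as the runs reported for the original programs.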

The result for the original program (which I've called nAsSum0) is:

$ echo 1000 | time ./nAsSum0
1981471878
59.40user 0.00system 0:59.48elapsed 99%CPU (0avgtext+0avgdata 467200maxresident)k
0inputs+0outputs (1935major+0minor)pagefaults 0swaps

For the version with caching:

$ echo 1000 | time ./nAsSum
1981471878
0.01user 0.01system 0:00.09elapsed 32%CPU (0avgtext+0avgdata 466176maxresident)k
0inputs+0outputs (1873major+0minor)pagefaults 0swaps

Both were run on a Windows 7 PC under Cygwin. Thus, the version with caching was too quick for the time command to measure accurately, whereas the original version took about 1 minute to run.
