0/1 Knapsack Problem Simplified in Python

Question

I have the following code, which runs too slowly. The problem is similar to the 0/1 knapsack problem: given an integer n, you have to find numbers in the range 1 to n - 1 whose squares add up to n squared.

For example, if n is 5, it should output 3, 4 because 3 ** 2 + 4 ** 2 = 25 = 5 ** 2. I have been struggling to understand how to make this more efficient and would like to know the concepts used to improve the efficiency of this type of program.

Some other examples:

n = 8: [None]
n = 30: [1, 3, 7, 29]
n = 16: [2, 3, 5, 7, 13]

I found some posts about this, but they seemed limited to two numbers, whereas my program needs to use as many numbers as it needs for their squares to add up to the square of the original number.

I watched some videos on the 0/1 knapsack problem. I struggled to apply the same concepts to my own program as the issue was quite different. They had things they could put in their bag that had a weight and profit.

This has all been hurting my brain for a few hours, and if anyone could even point me in the right direction I would appreciate it highly. Thank you :)

from math import sqrt
def decompose(n):

    lst = []

    sets = []

    temp = []

    perm = {}

    out = []

    for i in range (n):
        lst.append(i**2)


    for i in lst:
        for x in sets:
            temp.append(i + x)
            perm[i + x] = (i, x)
        for x in temp:
            if x not in sets:
                sets.append(x)
        if i not in sets:
            sets.append(i)
        temp = []

    if n**2 not in perm.keys():
        return None

    for i in perm[n**2]:
        if str(i).isdigit():
            out.append(i)
        if i == ' ':
            out.append(i)


    for i in out:
        if i not in lst:
            out.remove(i)
            for i in perm[i]:
                if str(i).isdigit():
                    out.append(i)
                if i == ' ':
                    out.append(i)

    out.sort()

    return [sqrt(i) for i in out]

Answer 1

This got too big for a comment, so I'm putting it here as an answer:

It's exactly 0/1 knapsack or a "coin change problem" (en.wikipedia.org/wiki/Change-making_problem). Your goal is to make 25 cents (if n = 5). Your "coins" are 1 cent, 4 cents, 9 cents, 16 cents, etc. Since you were looking at 0/1 knapsack, I'm assuming that you cannot reuse the same coin (if you can reuse the same coin, the problem is much simpler).

There are two approaches to dynamic programming problems like this. They are both intuitive in their own ways, but one may be more intuitive to you at the moment.

1.

The first is memoization (known as top-down). This is where you write a recursive function for decompose, but you cache the results of every call to decompose. The recursive formula here would be something like:

decompose_cache = {}  # stores results of calls to decompose

def decompose(n=25, coins_to_use=frozenset({1, 4, 9, 16})):
    # coins_to_use must be a frozenset so it can be part of a dictionary key
    # base cases: a goal of 0 needs no coins; a negative goal or an empty coin set fails
    if n == 0:
        return True
    if n < 0 or not coins_to_use:
        return False
    if (n, coins_to_use) in decompose_cache:
        return decompose_cache[(n, coins_to_use)]
    biggest_coin = max(coins_to_use)
    other_coins = coins_to_use - {biggest_coin}
    decomposition_with_biggest_coin = decompose(n - biggest_coin, other_coins)
    decomposition_without_biggest_coin = decompose(n, other_coins)
    ans = decomposition_with_biggest_coin or decomposition_without_biggest_coin
    decompose_cache[(n, coins_to_use)] = ans
    return ans

print(decompose(25, frozenset({1, 4, 9, 16})))

That is, to determine if we can make 25 cents using {1,4,9,16}, we merely need to check if we can make 25 cents using {1,4,9} OR if we can make 9 cents (25 - 16) using {1,4,9}. Without caching, this recursive definition would result in up to O(2^n) function calls, since every call branches into two. Because we cache the results, we do the computation for any (goal, coins) pair at most once. There are n^2 possible goals and, since we always remove the biggest coin, only n possible sets of coins (each one is the set of the k smallest coins), so there are n^2 * n pairs and therefore O(n^3) function calls.
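As a hedged variation on the idea above, here is a minimal sketch that identifies each coin set by a single index k (meaning "the coins 1², 2², ..., k²"), which makes the n^2 * n state count explicit. The names decompose_top_down and solve are made up for this illustration, and functools.lru_cache stands in for the hand-written cache. Unlike the true/false version, it returns the numbers used.

```python
from functools import lru_cache

def decompose_top_down(n):
    """Return distinct integers in 1..n-1 whose squares sum to n*n, or None."""

    @lru_cache(maxsize=None)
    def solve(goal, k):
        # Can `goal` be written as a sum of distinct squares from 1^2..k^2?
        # Returns a tuple of the numbers used (ascending), or None.
        if goal == 0:
            return ()
        if goal < 0 or k == 0:
            return None
        with_k = solve(goal - k * k, k - 1)   # use the k-th coin
        if with_k is not None:
            return with_k + (k,)
        return solve(goal, k - 1)             # skip the k-th coin

    result = solve(n * n, n - 1)
    return None if result is None else list(result)

print(decompose_top_down(5))  # [3, 4]
```

The recursion is still up to n - 1 levels deep, so for large n this sketch can hit Python's recursion limit.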

2.

The second approach is dynamic programming (known as bottom-up). (I personally think this is simpler to think about, and you won't run into maximum recursion depth issues in Python.)

This is where you fill up a table, starting from the empty base case, where the value of an entry in the table can be calculated by looking at the values of the entries already filled-in. We can call the table "DP".
Here, we can build a table where DP[n][k] is true if you can sum to a value of n by using only the first k "coins" (where the 1st coin is 1, the 2nd coin is 4, etc).

The way we can calculate the value of a cell in the table is:
DP[n][k] = DP[n - kth coin][k-1] OR DP[n][k-1]

The logic is the same as above: we can make change for 5 cents with the coins {1,4} (the first two coins) if and only if we can make change for 1 cent (5-4) using {1} (the first coin) or if we can make change for 5 cents using {1}. So, DP[5][2] = DP[1][1] or DP[5][1]. Again, there are n^3 entries in this table. With n = 5 there are four coins (1, 4, 9, 16), so you can fill the table row by row, first from [0][0] to [0][4] and then each following row up to [25][...], and the answer will be in [25][4].
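The table-filling described above could be sketched as follows. This is an illustration under stated assumptions, not a tuned implementation: the function name decompose_bottom_up is invented here, the table is indexed as dp[s][k] to match the DP[n][k] notation, and the walk back through the table to recover the actual numbers is an addition beyond the true/false recurrence.

```python
def decompose_bottom_up(n):
    """Return distinct integers in 1..n-1 whose squares sum to n*n, or None."""
    target = n * n
    num_coins = n - 1                          # coin k (1-based) has value k*k
    # dp[s][k] is True if sum s can be made using only the first k coins.
    dp = [[False] * (num_coins + 1) for _ in range(target + 1)]
    for s in range(target + 1):                # fill row by row
        dp[s][0] = (s == 0)                    # with no coins, only 0 is reachable
        for k in range(1, num_coins + 1):
            coin = k * k
            dp[s][k] = dp[s][k - 1] or (s >= coin and dp[s - coin][k - 1])
    if not dp[target][num_coins]:
        return None
    # Walk back from the answer cell to recover which coins were used.
    result = []
    s = target
    for k in range(num_coins, 0, -1):
        if not dp[s][k - 1]:                   # s is unreachable without coin k
            result.append(k)
            s -= k * k
    return sorted(result)

print(decompose_bottom_up(5))  # [3, 4]
```

The table holds (n^2 + 1) * n booleans, so memory grows quickly with n. Since each column only depends on the previous one, a one-dimensional table would be enough if only the yes/no answer is needed.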

Answer 2

Here is a recursive program to find a decomposition. The speed probably is not optimal. Certainly it is not optimal for searching large ranges of inputs, as the current approach doesn't cache intermediate results.

In this version, the function find_decomposition(n, k, uptonow, used) tries to find a decomposition for n² using only the numbers from k to n-1, given that we have already used the numbers in used, which give a partial sum of uptonow. The function recursively tries two possibilities: either the solution includes k itself, or it doesn't. It tries one possibility first and returns the result if that works; if not, it tries the other. So, it first tries a solution without k. If that doesn't work out, it does a quick test to see whether adding just k gives a solution. If that also fails, it recursively tries a solution that uses k, for which the set of used numbers now includes k and the sum uptonow is increased by k².

Many variations can be thought of:

Instead of running from 1 to n-1 , k could run in the reverse order. Be careful with the test-conditions of the if-test.
Instead of first trying a solution that doesn't include k , start with trying a solution that does include k .

Note that for large n, the function can run into the maximum recursion depth. E.g. when n=1000, there are about 2⁹⁹⁹ possible subsets of the numbers to be checked, and the recursion can go 999 levels deep, which is too much for the Python interpreter's default recursion limit.

Probably the approach of first using up high numbers can be beneficial, as it quickly reduces the gap to fill. Luckily for large numbers there exist many possible solutions, so a solution can be found quickly. Note that in the general knapsack problem as described by @Kevin Wang, if no solutions exist, any approach with 999 numbers will take too long to finish.

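def find_decomposition(n, k=1, uptonow=0, used=None):
    """Return used plus numbers from k..n-1 whose squares bring uptonow to n*n, or None."""
    used = [] if used is None else used  # avoid a mutable default argument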

    # first try without k
    if k < n-1:
        decomp = find_decomposition(n, k+1, uptonow, used)
        if decomp is not None:
            return decomp

    # now try including k
    used_with_k = used + [k]
    if uptonow + k * k == n * n:
        return used_with_k
    elif k < n-1 and uptonow + k * k + (k+1)*(k+1) <= n * n:
        # no need to try k if k doesn't fit together with at least one higher number
        return find_decomposition(n, k+1, uptonow+k*k, used_with_k)
    return None

for n in range(5,1001):
    print(n, find_decomposition(n))

Output:

5 [3, 4]
6 None
7 [2, 3, 6]
8 None
9 [2, 4, 5, 6]
10 [6, 8]
11 [2, 6, 9]
12 [1, 2, 3, 7, 9]
13 [5, 12]
14 [4, 6, 12]
15 [9, 12]
16 [3, 4, 5, 6, 7, 11]
...
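The variation of starting with the high numbers and trying to include k first could look roughly like the sketch below. The function name find_decomposition_high_first and the two pruning rules are assumptions made for this illustration and are not part of the code above; math.isqrt requires Python 3.8 or later.

```python
from math import isqrt

def find_decomposition_high_first(n):
    """Search from n-1 downwards; return an ascending list or None."""

    def search(remaining, k):
        # Write `remaining` as a sum of distinct squares from 1^2..k^2.
        if remaining == 0:
            return []
        if k == 0:
            return None
        # Never try a number whose square exceeds what is left.
        k = min(k, isqrt(remaining))
        # Prune: even all of 1^2 + 2^2 + ... + k^2 would not be enough.
        if k * (k + 1) * (2 * k + 1) // 6 < remaining:
            return None
        rest = search(remaining - k * k, k - 1)   # first try including k
        if rest is not None:
            return rest + [k]
        return search(remaining, k - 1)           # then try without k

    return search(n * n, n - 1)

print(find_decomposition_high_first(5))   # [3, 4]
print(find_decomposition_high_first(30))  # [1, 3, 7, 29]
```

Because a large square removes most of the remaining gap at once, k usually drops sharply after the first step, which keeps the recursion shallow in practice. The worst case is still exponential, and the decomposition found can differ from the one in the output above.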

PS: This link contains code about a related problem, but where squares can be repeated: https://www.geeksforgeeks.org/minimum-number-of-squares-whose-sum-equals-to-given-number-n/

2 answers

solution1 2 2019-11-27 02:10:04
solution2 1 ACCEPTED 2019-11-27 01:17:05
