Algorithm for finding subset out of list that fulfills constraint

Question

I'm looking for an efficient algorithm to perform the following task. An implementation in python would be optimal, but using another language or just pseudocode would be helpful as well.

Given is a rational number x and a list of about 600 rational numbers. Out of that list I want to find a subset of 10 numbers (no repetitions) that fulfills the constraint

s = the sum of x in addition to the sum of the numbers in the subset

so that the absolute difference between s and the nearest integer is less than 10^-12.

I guess that this might be done with some kind of graph algorithm, but I have no idea how to do it.

Example: This is a naive approach to illustrate what I am looking for. Obviously, it is far too inefficient for the number of combinations that arise from a list and subset as large as those described above:

#!/usr/bin/env python3

from itertools import combinations

x = 0.5
numbers = [1.000123, 2.192, 3.2143124, 1.00041, 2.0043, 3.5]
tol = 10**-3
for c in combinations(numbers, 3):
    s = x + sum(c)
    r = s % 1  # fractional part of s
    # s is close to an integer if r is just above 0 or just below 1
    if r <= tol or r >= 1 - tol:
        print("Found combination: s=%s for %s" % (s, c))
        break

Example output:

Found combination: s=6.000533 for (1.000123, 1.00041, 3.5)
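Since the inputs are described as rational numbers, the check itself can be done without any floating-point rounding by using Python's fractions module. This is only a sketch: the helper name distance_to_nearest_integer is made up for illustration, and it assumes the numbers are available as exact fractions (here built from decimal strings).

```python
from fractions import Fraction

def distance_to_nearest_integer(q):
    """Exact distance between the rational q and the closest integer."""
    return abs(q - round(q))

x = Fraction("0.5")
subset = [Fraction("1.000123"), Fraction("1.00041"), Fraction("3.5")]
s = x + sum(subset)
d = distance_to_nearest_integer(s)

# The tolerance is a Fraction as well, so the comparison is exact.
print(s, d, d < Fraction(1, 1000))
```

Exact arithmetic is much slower than floats, so it is probably more useful for verifying a candidate subset than for the search itself.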

Answer 1

This problem can be formulated as a mathematical program, which enables us to solve it with specialized algorithms like the simplex algorithm in combination with branch and bound. In most cases with branch and bound we only have to look at a fraction of the search tree to find an optimal solution.
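Written out, the model looks roughly as follows. The symbols are notation introduced here for illustration: a_1, ..., a_n are the list items, delta_i marks whether item i is chosen, k is the subset size, s is the integer to hit, r is the allowed deviation and epsilon is the tolerance. It is a pure feasibility problem, so no objective is needed.

```latex
\begin{aligned}
\text{find}\quad & \delta_1,\dots,\delta_n \in \{0,1\},\quad s \in \mathbb{Z}_{\ge 0},\quad r \in [-\varepsilon, \varepsilon] \\
\text{such that}\quad & x + \sum_{i=1}^{n} a_i\,\delta_i + r = s, \\
& \sum_{i=1}^{n} \delta_i = k .
\end{aligned}
```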

In Python there are many libraries for formulating and solving mixed-integer programs. I think CyLP and PuLP are among the well-known free ones.

pip install pulp

PuLP comes with a free solver for integer problems already set up if you install it this way, so I recommend using PuLP if it suffices for your problem.

Here is an implementation of your problem for PuLP:

import pulp

items = [1.000123, 2.192, 3.2143124, 1.00041, 2.0043, 3.5]
x = 0.5
tol = 0.001
numstouse = 3

# delta[i] is 1 if items[i] is part of the subset and 0 otherwise
delta = pulp.LpVariable.dicts('listitems', range(len(items)), lowBound=0, upBound=1, cat=pulp.LpInteger)
# s is the integer that the sum has to hit, r is the allowed deviation
s = pulp.LpVariable('value', lowBound=0, cat=pulp.LpInteger)
r = pulp.LpVariable('deviation', lowBound=-tol, upBound=tol)

# Model formulation (a pure feasibility problem, so there is no objective)
prob = pulp.LpProblem("Sum_Problem", pulp.LpMinimize)

# Constraints
prob += pulp.lpSum([delta[i] * items[i] for i in range(len(items))]) + x + r == s
prob += pulp.lpSum([delta[i] for i in range(len(items))]) == numstouse
prob.solve()

# Only read the variable values if the solver actually found a solution
if pulp.LpStatus[prob.status] == "Optimal":
    chosen = [i for i in delta if round(delta[i].value()) == 1]
    print("Found combination: s={} for {}".format(s.value(), tuple(items[i] for i in chosen)))
    for i in chosen:
        print("List item {} is used in the sum with value {}.".format(i, items[i]))
else:
    print("The problem seems to be infeasible.")

It is important to note that, while PuLP will solve your problem eventually with the default free solver, commercial solvers are often many times faster than free ones, and the gap tends to widen as the problem size grows.

Answer 2

Here is a heuristic solution using only numpy.

UPDATE: Please forgive my bragging, but be sure not to miss the very end of this post where we solve for k = 80 terms out of N = 1200 numbers at 10^-12 accuracy. The solver finds more than 100 solutions in just above 4 seconds on modest hardware.

Algorithm (for the N = 600, k = 10, eps = 10^-12 case):

It takes advantage of the fact that, statistically, there are lots and lots of solutions, and it samples only a manageable subspace.

Indeed, if the sums are evenly distributed modulo 1, it suffices to test 4x10^12 random sums for a >99.9% chance of finding a solution. This can be brought to tractable levels by building each sum from two half-sums, drawn from two sets of 2x10^6 half-sums each: one can then avoid computing most of the pairwise sums by using a trick which only involves sorting 4x10^6 numbers, which is easy on current hardware.
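The >99.9% figure can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch that assumes the sums are independent and uniform modulo 1, so that each one lands within eps of an integer with probability 2*eps; the variable names are mine.

```python
import math

eps = 1e-12          # tolerance on the distance to the nearest integer
n_sums = 4e12        # number of random sums tested (2e6 * 2e6 half-sum pairs)
p_hit = 2 * eps      # chance that a uniform value mod 1 is within eps of an integer

# P(at least one hit) = 1 - (1 - p_hit)**n_sums, computed in a numerically stable way
p_any = -math.expm1(n_sums * math.log1p(-p_hit))
expected_hits = n_sums * p_hit
print(p_any, expected_hits)
```

With about 8 expected hits, the chance of finding none is roughly e^-8, which is where the >99.9% comes from.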

This is how it works.

It chooses 10 disjoint random subsamples of 30 numbers each (one can go down to 20 and still be pretty sure to find solutions), splits them into two groups of 5 subsamples and computes all 30**5 sums for each group, taking one number from each subsample. The sums of one half are then subtracted from minus the input x. Everything is then reduced modulo 1 and sorted.

Among the differences between consecutive elements, typically a good 2,000 are below the tolerance of 10^-12, about half of which are between sums from different halves. All of these are solutions.

Most of the complexity of the code is owed to tracing back the indirect sort.

import numpy as np
import time

def binom(N, k):
    return np.prod(np.arange(N, N-k, -1).astype(object)) \
        // np.prod(np.arange(2, k+1).astype(object))

def master(nlist, input, k=10, HS=10**7, eps=12, ntrials=10):
    for j in range(ntrials):
        res = trial(nlist, input, k=k, HS=HS, eps=eps)
        if res is not None:
            return res
    print("No solution found in", ntrials, "trials.")

def trial(nlist, input, k=10, HS=10**7, eps=12):
    tol = 10**-eps
    srps = str(eps)
    t0 = time.time()
    N = len(nlist)
    if 2**(k//2) > HS or k > 64:
        kk = min(2 * int (np.log(HS) / np.log(2)), 64)
    else:
        kk = k        
    kA, kB = (kk+1)//2, kk//2
    CA = min(int(HS**(1/kA)), (N+kk-k) // (kA+kB))
    CB = min(int(HS**(1/kB)), (N+kk-k) // (kA+kB))
    inds = np.random.permutation(N)
    indsA = np.reshape(inds[:kA*CA], (kA, CA))
    indsB = np.reshape(inds[kA*CA:kA*CA+kB*CB], (kB, CB))
    extra = inds[N-k+kk:]
    A = sum(np.ix_(*tuple(nlist[indsA]))).ravel() % 1
    B = (-input - nlist[extra].sum()
         - sum(np.ix_(*tuple(nlist[indsB]))).ravel()) % 1
    AB = np.r_[A, B]
    ABi = np.argsort(AB)
    AB = np.where(np.diff(AB[ABi]) < tol)[0]
    nsol = len(AB)
    if nsol == 0:
        return None
    # translate back ...
    ABl = ABi[AB]
    ABh = ABi[AB+1]
    ABv = (ABl >= CA**kA) != (ABh >= CA**kA)
    nsol = np.count_nonzero(ABv)
    if nsol == 0:
        return None
    ABl, ABh = ABl[ABv], ABh[ABv]
    Ai = np.where(ABh >= CA**kA, ABl, ABh)
    Bi = np.where(ABh < CA**kA, ABl, ABh) - CA**kA
    Ai = np.unravel_index(Ai, kA * (CA,))
    Bi = np.unravel_index(Bi, kB * (CB,))
    solutions = [np.r_[indsA[np.arange(kA), Aii],
                       indsB[np.arange(kB), Bii], extra]
                 for Aii, Bii in zip(np.c_[Ai], np.c_[Bi])]
    total_time = time.time() - t0
    for sol in solutions:
        print(("{:."+srps+"f}  =  {:."+srps+"f}  " + "\n".join([
            j * (" + {:."+srps+"f}") for j
            in np.diff(np.r_[0, np.arange(4, k, 6), k])])).format(
                   nlist[sol].sum() + input, input, *nlist[sol]))
    print("\n{} solutions found in {:.3f} seconds, sampling {:.6g}% of"
          " available space.".format(nsol, total_time,
                                     100 * (CA**kA + CB**kB) / binom(N, k)))
    return solutions
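To make the sort trick easier to follow in isolation, here is a stripped-down sketch on a toy instance (3 + 3 terms, 12 candidates per term, tolerance 10^-6). It is not the code above: the function names half_sums and close_pairs and all the sizes are made up for illustration, and like the original it ignores pairs that are only close across the 0/1 wrap-around.

```python
import numpy as np

def half_sums(groups):
    """All sums that pick one number per row of `groups`, reduced modulo 1."""
    total = np.zeros(1)
    for row in groups:
        total = np.add.outer(total, row).ravel()
    return total % 1.0

def close_pairs(a_vals, b_vals, tol):
    """Pairs (i, j) where a_vals[i] and b_vals[j] are neighbours in sorted order and closer than tol."""
    merged = np.concatenate([a_vals, b_vals])
    order = np.argsort(merged, kind="stable")      # indirect sort
    hits = np.nonzero(np.diff(merged[order]) < tol)[0]
    lo, hi = order[hits], order[hits + 1]
    n_a = len(a_vals)
    mixed = (lo < n_a) != (hi < n_a)               # keep pairs from different halves only
    lo, hi = lo[mixed], hi[mixed]
    a_idx = np.where(lo < n_a, lo, hi)
    b_idx = np.where(lo < n_a, hi, lo) - n_a
    return list(zip(a_idx.tolist(), b_idx.tolist()))

rng = np.random.default_rng(0)
x = 0.25
tol = 1e-6
groups_a = rng.random((3, 12))   # 12**3 half-sums for the first half
groups_b = rng.random((3, 12))   # 12**3 half-sums for the second half

a_vals = half_sums(groups_a)
b_vals = (-x - half_sums(groups_b)) % 1.0   # a == b (mod 1) means x + sumA + sumB is an integer
pairs = close_pairs(a_vals, b_vals, tol)

# Translate the flat indices back into one choice per group and print the full sums
rows = np.arange(3)
for i, j in pairs:
    ia = list(np.unravel_index(i, (12,) * 3))
    ib = list(np.unravel_index(j, (12,) * 3))
    print(x + groups_a[rows, ia].sum() + groups_b[rows, ib].sum())
```

The point of the construction is that only about 2 * 12**3 numbers get sorted, while 12**6 pairwise sums are covered implicitly.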

Output:

a = np.random.random(600)
b = np.random.random()
s = trial(a, b)
...
<  --   snip   --  >
...
5.000000000000  =  0.103229509601  + 0.006642137376 + 0.312241735755 + 0.784266426461 + 0.902345822935 + 0.988978878589
 + 0.973861938944 + 0.191460799437 + 0.131957251738 + 0.010524218878 + 0.594491280285
5.999999999999  =  0.103229509601  + 0.750882954181 + 0.365709602773 + 0.421458098864 + 0.767072742224 + 0.689495123832
 + 0.654006237725 + 0.418856927051 + 0.892913889958 + 0.279342774349 + 0.657032139442
6.000000000000  =  0.103229509601  + 0.765785564962 + 0.440313432133 + 0.987713329856 + 0.785837107607 + 0.018125214584
 + 0.742834214592 + 0.820268051141 + 0.232822918386 + 0.446038517697 + 0.657032139442
5.000000000001  =  0.103229509601  + 0.748677981958 + 0.708845535002 + 0.330115345473 + 0.660387831821 + 0.549772082712
 + 0.215300958403 + 0.820268051141 + 0.258387204727 + 0.010524218878 + 0.594491280285
5.000000000001  =  0.103229509601  + 0.085365104308 + 0.465618675355 + 0.197311784789 + 0.656004057436 + 0.595032922699
 + 0.698000899403 + 0.546925212167 + 0.844915369567 + 0.333326991548 + 0.474269473129

1163 solutions found in 18.431 seconds, sampling 0.000038% of available space.

Since only simple operations are used we are essentially only limited by floating point accuracy. So let's ask for 10^-14:
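For a rough idea of where that limit sits, one can look at the spacing between adjacent double-precision numbers at the magnitude of the sums involved. This is just an illustration and assumes Python 3.9+ for math.ulp; rounding errors accumulated over the additions come on top of this spacing.

```python
import math

# Spacing between adjacent doubles near typical values of x plus ten numbers in [0, 1)
for value in (1.0, 4.0, 8.0):
    print(value, math.ulp(value))

ulp_at_8 = math.ulp(8.0)   # equals 2**-49, i.e. a bit under 2e-15
```

So a tolerance of 10^-14 is only a few units in the last place away from what doubles can resolve for sums of this size.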

...
<  --   snip   --  >
...
6.00000000000000  =  0.25035941161807  + 0.97389388071258 + 0.10625051346950 + 0.59833873712725 + 0.89897827417947
 + 0.78865856416474 + 0.35381392162358 + 0.87346871541364 + 0.53658653353249 + 0.21248261924724 + 0.40716882891145
5.00000000000000  =  0.25035941161807  + 0.24071288846314 + 0.48554094441439 + 0.50713200488770 + 0.38874292843933
 + 0.86313933327877 + 0.90048328572856 + 0.49027844783527 + 0.23879340585229 + 0.10277432242557 + 0.53204302705691
5.00000000000000  =  0.25035941161807  + 0.38097649901116 + 0.48554094441439 + 0.46441170824601 + 0.62826547862002
 + 0.86313933327877 + 0.33939826575779 + 0.73873418282621 + 0.04398883198337 + 0.62252491844691 + 0.18266042579730
3.00000000000000  =  0.25035941161807  + 0.06822167273996 + 0.23678340695986 + 0.46441170824601 + 0.08855356615846
 + 0.00679943782685 + 0.74823208211878 + 0.56709685813503 + 0.44549706663049 + 0.05232395855097 + 0.07172083101554
4.99999999999999  =  0.25035941161807  + 0.02276077008953 + 0.29734365315824 + 0.74952397467956 + 0.74651313615300
 + 0.06942795892486 + 0.33939826575779 + 0.28515053127059 + 0.75198496353405 + 0.95549430775741 + 0.53204302705691
6.00000000000000  =  0.25035941161807  + 0.87635507011986 + 0.24113470302798 + 0.37942029808604 + 0.08855356615846
 + 0.30383588785334 + 0.79224372764376 + 0.85138208150978 + 0.76217062127440 + 0.76040834996762 + 0.69413628274069
5.00000000000000  =  0.25035941161807  + 0.06822167273996 + 0.51540640390940 + 0.91798512102932 + 0.63568890016512
 + 0.75300966489960 + 0.30826232152132 + 0.54179156374890 + 0.30349257203507 + 0.63406153731771 + 0.07172083101554

11 solutions found in 18.397 seconds, sampling 0.000038% of available space.

Or we can reduce the number of samples for a faster execution time:

...
<  --   snip   -- >
...
4.999999999999  =  0.096738768432  + 0.311969906774 + 0.830155028676 + 0.164375548024 + 0.118447437942
 + 0.362452121111 + 0.676458354204 + 0.627931895727 + 0.568131437959 + 0.579341106837 + 0.663998394313
5.000000000000  =  0.096738768432  + 0.682823940439 + 0.768308425728 + 0.290242415733 + 0.303087635772
 + 0.776829608333 + 0.229947280121 + 0.189745700730 + 0.469824524584 + 0.795706660727 + 0.396745039400
6.000000000000  =  0.096738768432  + 0.682823940439 + 0.219502575013 + 0.164375548024 + 0.853518966685
 + 0.904544718964 + 0.272487275000 + 0.908201512199 + 0.570219149773 + 0.840338947058 + 0.487248598411
6.000000000001  =  0.096738768432  + 0.838905554517 + 0.837179741796 + 0.655925596548 + 0.121227619542
 + 0.393276631434 + 0.529706372738 + 0.627931895727 + 0.857852927706 + 0.827365021028 + 0.213889870533
5.000000000000  =  0.096738768432  + 0.037789824744 + 0.219502575013 + 0.578848374222 + 0.618570311975
 + 0.393356108716 + 0.999687645216 + 0.163539900985 + 0.734447052985 + 0.840338947058 + 0.317180490652
5.000000000001  =  0.096738768432  + 0.093352607179 + 0.600306836676 + 0.914256455483 + 0.618570311975
 + 0.759417445766 + 0.252660056506 + 0.422864494209 + 0.298221673761 + 0.456362751604 + 0.487248598411

25 solutions found in 1.606 seconds, sampling 0.000001% of available space.

Finally, it scales easily:

N = 1200; a = np.random.random(N)
b = np.random.random()
k = 80; s = trial(a, b, k)

Output:

...
<  --   snip   --  >
...
37.000000000000  =  0.189587827991   + 0.219870655535 + 0.422462560363 + 0.446529942912 + 0.340513300967
 + 0.272272603670 + 0.701821613150 + 0.016414376458 + 0.228845802410 + 0.071882553217 + 0.966675626054
 + 0.947578041095 + 0.016404068780 + 0.010927217220 + 0.160372498474 + 0.498852167218 + 0.018622555121
 + 0.199963779290 + 0.977205343235 + 0.272323870374 + 0.468492667326 + 0.405511314584 + 0.091160625930
 + 0.243752782720 + 0.563265391730 + 0.938591630157 + 0.053376502849 + 0.176084585660 + 0.212015784524
 + 0.093291552095 + 0.272949310717 + 0.697415829563 + 0.296772790257 + 0.302205095562 + 0.928446954142
 + 0.033615064623 + 0.038778684994 + 0.743281078457 + 0.931343341817 + 0.995992351352 + 0.803282407390
 + 0.714717982763 + 0.002658373156 + 0.366005349525 + 0.569351286490 + 0.515456813437 + 0.193641742784
 + 0.188781686796 + 0.622488518613 + 0.632796984155 + 0.343964602031 + 0.494069912343 + 0.891150139880
 + 0.526788287274 + 0.066698500327 + 0.236622057166 + 0.249176977739 + 0.881250574063 + 0.940333075706
 + 0.936703186575 + 0.400023784940 + 0.875090761246 + 0.485734931256 + 0.281568612107 + 0.493793875212
 + 0.021540268393 + 0.576960812516 + 0.330968114316 + 0.814755318215 + 0.964632238890 + 0.252849647521
 + 0.328316150100 + 0.831418052792 + 0.474425361099 + 0.877461270445 + 0.720632491736 + 0.719074649194
 + 0.698827578293 + 0.378885181918 + 0.661859236288 + 0.169773462717

119 solutions found in 4.039 seconds, sampling 8.21707e-118% of available space.

Note that only 46 of the numbers in the sum were actually computed; the other 34 were chosen randomly by the algorithm beforehand.
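The 46/34 split follows from the kk logic at the top of trial() with the default HS = 10**7. The helper below just repeats that arithmetic on its own; the name computed_terms is made up for this sketch.

```python
import math

def computed_terms(k, HS=10**7):
    """Number of terms actually searched for, mirroring the kk logic in trial()."""
    if 2 ** (k // 2) > HS or k > 64:
        return min(2 * int(math.log(HS) / math.log(2)), 64)
    return k

k = 80
kk = computed_terms(k)
print(kk, k - kk)   # searched terms, randomly fixed terms
```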

Answer 3

It's not as good as I expected, but I have spent far too much time on this not to share it.

My idea was that you could reduce the problem from O(n^10) to O(n^5) by splitting the subset of size 10 in two.

First, compute all subsets of size 5, then sort them by their sum modulo 1 (so that the sum is between 0 and 1).

Then an answer consists of two subsets of size 5 such that:

these subsets do not intersect
the total sum is either < e, between 1-e and 1+e, or > 2-e, with e being 10^-12 in your case

Each one of these 3 checks is really cheap if you iterate over the sorted subsets of size 5 smartly (really, that's the part that brings O(n^10) down to O(n^5)).
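The "iterate smartly" part for the middle case (total close to 1) is essentially a two-pointer scan over two sorted lists. Here is a minimal sketch on plain numbers; the function name is made up, and it deliberately leaves out the check that the two subsets do not intersect, which has to be applied on top.

```python
def find_pair_near_one(a_sorted, b_sorted, eps):
    """Return (i, j) with |a_sorted[i] + b_sorted[j] - 1| <= eps, or None if there is none."""
    i, j = 0, len(b_sorted) - 1
    while i < len(a_sorted) and j >= 0:
        total = a_sorted[i] + b_sorted[j]
        if abs(total - 1.0) <= eps:
            return i, j
        if total < 1.0:
            i += 1   # sum too small: move to a larger value on the left
        else:
            j -= 1   # sum too large: move to a smaller value on the right
    return None

a = [0.1, 0.4, 0.7]     # sorted half-sums modulo 1
b = [0.2, 0.6, 0.95]    # sorted half-sums (including x) modulo 1
print(find_pair_near_one(a, b, 1e-12))
```

Each step advances one of the two pointers, so the scan takes at most len(a) + len(b) steps instead of len(a) * len(b).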

So yes, this solution is O(n^5). The problem is that my solution:

is in Python, which is not the fastest language around
still requires computing all combinations of size 5, and "choose 5 among 600" is intractable (637262850120 combinations)
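For scale, the two counts can be computed directly with math.comb (Python 3.8+). This is only there to put numbers on the O(n^10) versus O(n^5) comparison.

```python
import math

n = 600
full = math.comb(n, 10)   # number of subsets of size 10
half = math.comb(n, 5)    # number of subsets of size 5
print(full, half)
```

The full count is on the order of 10^21, so halving the subset size helps enormously, but about 6.4x10^11 half-subsets is still far too many to enumerate and sort in pure Python.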

Edit: I now just grab random subsets of size 40 from the large list of size 600 and test all the combinations. If that doesn't work, I grab a new random subset and try again. It could be improved by making it multithreaded.

The output was found at step 44 after 20 minutes:

l = [0.06225774829854913, 0.21267040355189515, 0.21954445729288707, 0.21954445729288707, 0.24621125123532117, 0.24621125123532117, 0.36931687685298087, 0.4017542509913792, 0.41421356237309515, 0.41640786499873883, 0.6619037896906015]
>>> sum(l) + 0.529964086141668
3.999999999955324

Watch out for the hack that skips the first 43 steps; comment it out if you want to do a real computation.

from math import pow
from itertools import combinations
import itertools
import random
import time

# Constants
random.seed(1)
listLength = 40
halfsize = 5
halfsize2 = 6
x = 0.529964086141668
epsilon = pow(10,-10)

# Define your list here
#myList = [random.random() for i in range(listLength)]
items = [0.9705627484771391, 0.2788205960997061, 0.620499351813308, 0.0, 0.4222051018559565, 0.892443989449804, 0.41640786499873883, 0.0, 0.6491106406735181, 0.36931687685298087, 0.16552506059643868, 0.04159457879229578, 0.0, 0.04159457879229578, 0.16552506059643868, 0.36931687685298087, 0.6491106406735181, 0.0, 0.41640786499873883, 0.892443989449804, 0.4222051018559565, 0.0, 0.620499351813308, 0.2788205960997061, 0.9705627484771391, 0.2788205960997061, 0.556349186104045, 0.866068747318506, 0.21267040355189515, 0.6014705087354439, 0.038404810405298306, 0.5299640861416677, 0.08304597359457233, 0.7046999107196257, 0.4017542509913792, 0.18033988749894903, 0.045361017187261155, 0.0, 0.045361017187261155, 0.18033988749894903, 0.4017542509913792, 0.7046999107196257, 0.08304597359457233, 0.5299640861416677, 0.038404810405298306, 0.6014705087354439, 0.21267040355189515, 0.866068747318506, 0.556349186104045, 0.2788205960997061, 0.620499351813308, 0.866068747318506, 0.142135623730951, 0.45362404707370985, 0.8062484748656971, 0.20655561573370207, 0.6619037896906015, 0.18033988749894903, 0.7703296142690075, 0.4403065089105507, 0.19803902718556898, 0.049875621120889946, 0.0, 0.049875621120889946, 0.19803902718556898, 0.4403065089105507, 0.7703296142690075, 0.18033988749894903, 0.6619037896906015, 0.20655561573370207, 0.8062484748656971, 0.45362404707370985, 0.142135623730951, 0.866068747318506, 0.620499351813308, 0.0, 0.21267040355189515, 0.45362404707370985, 0.7279220613578552, 0.04159457879229578, 0.4017542509913792, 0.8166538263919687, 0.2956301409870008, 0.8488578017961039, 0.4868329805051381, 0.21954445729288707, 0.0553851381374173, 0.0, 0.0553851381374173, 0.21954445729288707, 0.4868329805051381, 0.8488578017961039, 0.2956301409870008, 0.8166538263919687, 0.4017542509913792, 0.04159457879229578, 0.7279220613578552, 0.45362404707370985, 0.21267040355189515, 0.0, 0.4222051018559565, 0.6014705087354439, 0.8062484748656971, 0.04159457879229578, 0.31370849898476116, 
0.63014581273465, 0.0, 0.43398113205660316, 0.9442719099991592, 0.5440037453175304, 0.24621125123532117, 0.06225774829854913, 0.0, 0.06225774829854913, 0.24621125123532117, 0.5440037453175304, 0.9442719099991592, 0.43398113205660316, 0.0, 0.63014581273465, 0.31370849898476116, 0.04159457879229578, 0.8062484748656971, 0.6014705087354439, 0.4222051018559565, 0.892443989449804, 0.038404810405298306, 0.20655561573370207, 0.4017542509913792, 0.63014581273465, 0.8994949366116654, 0.21954445729288707, 0.6023252670426267, 0.06225774829854913, 0.6157731058639087, 0.28010988928051805, 0.0710678118654755, 0.0, 0.0710678118654755, 0.28010988928051805, 0.6157731058639087, 0.06225774829854913, 0.6023252670426267, 0.21954445729288707, 0.8994949366116654, 0.63014581273465, 0.4017542509913792, 0.20655561573370207, 0.038404810405298306, 0.892443989449804, 0.41640786499873883, 0.5299640861416677, 0.6619037896906015, 0.8166538263919687, 0.0, 0.21954445729288707, 0.48528137423856954, 0.810249675906654, 0.21110255092797825, 0.7082039324993694, 0.32455532033675905, 0.08276253029821934, 0.0, 0.08276253029821934, 0.32455532033675905, 0.7082039324993694, 0.21110255092797825, 0.810249675906654, 0.48528137423856954, 0.21954445729288707, 0.0, 0.8166538263919687, 0.6619037896906015, 0.5299640861416677, 0.41640786499873883, 0.0, 0.08304597359457233, 0.18033988749894903, 0.2956301409870008, 0.43398113205660316, 0.6023252670426267, 0.810249675906654, 0.0710678118654755, 0.40312423743284853, 0.8309518948453007, 0.38516480713450374, 0.09901951359278449, 0.0, 0.09901951359278449, 0.38516480713450374, 0.8309518948453007, 0.40312423743284853, 0.0710678118654755, 0.810249675906654, 0.6023252670426267, 0.43398113205660316, 0.2956301409870008, 0.18033988749894903, 0.08304597359457233, 0.0, 0.6491106406735181, 0.7046999107196257, 0.7703296142690075, 0.8488578017961039, 0.9442719099991592, 0.06225774829854913, 0.21110255092797825, 0.40312423743284853, 0.6568542494923806, 0.0, 0.4721359549995796, 
0.12310562561766059, 0.0, 0.12310562561766059, 0.4721359549995796, 0.0, 0.6568542494923806, 0.40312423743284853, 0.21110255092797825, 0.06225774829854913, 0.9442719099991592, 0.8488578017961039, 0.7703296142690075, 0.7046999107196257, 0.6491106406735181, 0.36931687685298087, 0.4017542509913792, 0.4403065089105507, 0.4868329805051381, 0.5440037453175304, 0.6157731058639087, 0.7082039324993694, 0.8309518948453007, 0.0, 0.24264068711928477, 0.6055512754639891, 0.16227766016837952, 0.0, 0.16227766016837952, 0.6055512754639891, 0.24264068711928477, 0.0, 0.8309518948453007, 0.7082039324993694, 0.6157731058639087, 0.5440037453175304, 0.4868329805051381, 0.4403065089105507, 0.4017542509913792, 0.36931687685298087, 0.16552506059643868, 0.18033988749894903, 0.19803902718556898, 0.21954445729288707, 0.24621125123532117, 0.28010988928051805, 0.32455532033675905, 0.38516480713450374, 0.4721359549995796, 0.6055512754639891, 0.8284271247461903, 0.2360679774997898, 0.0, 0.2360679774997898, 0.8284271247461903, 0.6055512754639891, 0.4721359549995796, 0.38516480713450374, 0.32455532033675905, 0.28010988928051805, 0.24621125123532117, 0.21954445729288707, 0.19803902718556898, 0.18033988749894903, 0.16552506059643868, 0.04159457879229578, 0.045361017187261155, 0.049875621120889946, 0.0553851381374173, 0.06225774829854913, 0.0710678118654755, 0.08276253029821934, 0.09901951359278449, 0.12310562561766059, 0.16227766016837952, 0.2360679774997898, 0.41421356237309515, 0.0, 0.41421356237309515, 0.2360679774997898, 0.16227766016837952, 0.12310562561766059, 0.09901951359278449, 0.08276253029821934, 0.0710678118654755, 0.06225774829854913, 0.0553851381374173, 0.049875621120889946, 0.045361017187261155, 0.04159457879229578, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.04159457879229578, 0.045361017187261155, 0.049875621120889946, 0.0553851381374173, 0.06225774829854913, 0.0710678118654755, 0.08276253029821934, 
0.09901951359278449, 0.12310562561766059, 0.16227766016837952, 0.2360679774997898, 0.41421356237309515, 0.0, 0.41421356237309515, 0.2360679774997898, 0.16227766016837952, 0.12310562561766059, 0.09901951359278449, 0.08276253029821934, 0.0710678118654755, 0.06225774829854913, 0.0553851381374173, 0.049875621120889946, 0.045361017187261155, 0.04159457879229578, 0.16552506059643868, 0.18033988749894903, 0.19803902718556898, 0.21954445729288707, 0.24621125123532117, 0.28010988928051805, 0.32455532033675905, 0.38516480713450374, 0.4721359549995796, 0.6055512754639891, 0.8284271247461903, 0.2360679774997898, 0.0, 0.2360679774997898, 0.8284271247461903, 0.6055512754639891, 0.4721359549995796, 0.38516480713450374, 0.32455532033675905, 0.28010988928051805, 0.24621125123532117, 0.21954445729288707, 0.19803902718556898, 0.18033988749894903, 0.16552506059643868, 0.36931687685298087, 0.4017542509913792, 0.4403065089105507, 0.4868329805051381, 0.5440037453175304, 0.6157731058639087, 0.7082039324993694, 0.8309518948453007, 0.0, 0.24264068711928477, 0.6055512754639891, 0.16227766016837952, 0.0, 0.16227766016837952, 0.6055512754639891, 0.24264068711928477, 0.0, 0.8309518948453007, 0.7082039324993694, 0.6157731058639087, 0.5440037453175304, 0.4868329805051381, 0.4403065089105507, 0.4017542509913792, 0.36931687685298087, 0.6491106406735181, 0.7046999107196257, 0.7703296142690075, 0.8488578017961039, 0.9442719099991592, 0.06225774829854913, 0.21110255092797825, 0.40312423743284853, 0.6568542494923806, 0.0, 0.4721359549995796, 0.12310562561766059, 0.0, 0.12310562561766059, 0.4721359549995796, 0.0, 0.6568542494923806, 0.40312423743284853, 0.21110255092797825, 0.06225774829854913, 0.9442719099991592, 0.8488578017961039, 0.7703296142690075, 0.7046999107196257, 0.6491106406735181, 0.0, 0.08304597359457233, 0.18033988749894903, 0.2956301409870008, 0.43398113205660316, 0.6023252670426267, 0.810249675906654, 0.0710678118654755, 0.40312423743284853, 0.8309518948453007, 0.38516480713450374, 
0.09901951359278449, 0.0, 0.09901951359278449, 0.38516480713450374, 0.8309518948453007, 0.40312423743284853, 0.0710678118654755, 0.810249675906654, 0.6023252670426267, 0.43398113205660316, 0.2956301409870008, 0.18033988749894903, 0.08304597359457233, 0.0, 0.41640786499873883, 0.5299640861416677, 0.6619037896906015, 0.8166538263919687, 0.0, 0.21954445729288707, 0.48528137423856954, 0.810249675906654, 0.21110255092797825, 0.7082039324993694, 0.32455532033675905, 0.08276253029821934, 0.0, 0.08276253029821934, 0.32455532033675905, 0.7082039324993694, 0.21110255092797825, 0.810249675906654, 0.48528137423856954, 0.21954445729288707, 0.0, 0.8166538263919687, 0.6619037896906015, 0.41640786499873883, 0.892443989449804, 0.038404810405298306, 0.20655561573370207, 0.4017542509913792, 0.63014581273465, 0.8994949366116654, 0.21954445729288707, 0.6023252670426267, 0.06225774829854913, 0.6157731058639087, 0.28010988928051805, 0.0710678118654755, 0.0, 0.0710678118654755, 0.28010988928051805, 0.6157731058639087, 0.06225774829854913, 0.6023252670426267, 0.21954445729288707, 0.8994949366116654, 0.63014581273465, 0.4017542509913792, 0.20655561573370207, 0.038404810405298306, 0.892443989449804, 0.4222051018559565, 0.6014705087354439, 0.8062484748656971, 0.04159457879229578, 0.31370849898476116, 0.63014581273465, 0.0, 0.43398113205660316, 0.9442719099991592, 0.5440037453175304, 0.24621125123532117, 0.06225774829854913, 0.0, 0.06225774829854913, 0.24621125123532117, 0.5440037453175304, 0.9442719099991592, 0.43398113205660316, 0.0, 0.63014581273465, 0.31370849898476116, 0.04159457879229578, 0.8062484748656971, 0.6014705087354439, 0.4222051018559565, 0.0, 0.21267040355189515, 0.45362404707370985, 0.7279220613578552, 0.04159457879229578, 0.4017542509913792, 0.8166538263919687, 0.2956301409870008, 0.8488578017961039, 0.4868329805051381, 0.21954445729288707, 0.0553851381374173, 0.0, 0.0553851381374173, 0.21954445729288707, 0.4868329805051381, 0.8488578017961039, 0.2956301409870008, 
0.8166538263919687, 0.4017542509913792, 0.04159457879229578, 0.7279220613578552, 0.45362404707370985, 0.21267040355189515, 0.0, 0.620499351813308, 0.866068747318506, 0.142135623730951, 0.45362404707370985, 0.8062484748656971, 0.20655561573370207, 0.6619037896906015, 0.18033988749894903, 0.7703296142690075, 0.4403065089105507, 0.19803902718556898, 0.049875621120889946, 0.0, 0.049875621120889946, 0.19803902718556898, 0.4403065089105507, 0.7703296142690075, 0.18033988749894903, 0.6619037896906015, 0.20655561573370207, 0.8062484748656971, 0.45362404707370985, 0.142135623730951, 0.866068747318506, 0.620499351813308, 0.2788205960997061, 0.556349186104045, 0.866068747318506, 0.21267040355189515, 0.6014705087354439, 0.038404810405298306, 0.5299640861416677, 0.08304597359457233, 0.7046999107196257, 0.4017542509913792, 0.18033988749894903, 0.045361017187261155, 0.0, 0.045361017187261155, 0.18033988749894903, 0.4017542509913792, 0.7046999107196257, 0.08304597359457233, 0.5299640861416677, 0.038404810405298306, 0.6014705087354439, 0.21267040355189515, 0.866068747318506, 0.556349186104045, 0.2788205960997061, 0.9705627484771391, 0.2788205960997061, 0.620499351813308, 0.0, 0.4222051018559565, 0.892443989449804, 0.41640786499873883, 0.0, 0.6491106406735181, 0.36931687685298087, 0.16552506059643868, 0.04159457879229578, 0.0, 0.04159457879229578, 0.16552506059643868, 0.36931687685298087, 0.6491106406735181, 0.0, 0.41640786499873883, 0.892443989449804, 0.4222051018559565, 0.0, 0.620499351813308, 0.2788205960997061, 0.9705627484771391]
items = sorted(items)
#print(len(items))
#print(items)

itemSet = sorted(list(set(items)))
#print(len(itemSet))
#print(itemSet)
#print(myList)

#Utility functions
#s is a set of indices
def mySum(s):
  return (sum([myList[i] for i in s]))%1

def mySum2(s):
  return (sum([myList[i] for i in s]) + x)%1


start = time.time()


for step in range(1, 1000000):

    myList = random.sample(items,  listLength)
    # HACK
    if(step<44):
        continue

    print("Step %s"%(step))
    myList = sorted(myList)

    listHalfIndices = [i for i in combinations(range(len(myList)),halfsize)]
    listHalfIndices1 = sorted(listHalfIndices, key = mySum)
    #print(listHalfIndices)
    #print([mySum(s) for s in listHalfIndices])

    listHalfIndices2 = [i for i in combinations(range(len(myList)),halfsize2)]
    listHalfIndices2 = sorted(listHalfIndices2, key = mySum2)
    #print(listHalfIndices2)
    #print([mySum2(s) for s in listHalfIndices2])
    """
    # SKIP THIS as I heuristically noted that it was pretty useless
    # First answer if the sum of the first and second list is smaller than epsilon
    print("ANSWER TYPE 1")
    #print([mySum(s) for s in listHalfIndices1[0:10]])
    #print([mySum2(s) for s in listHalfIndices2[0:10]])


    listLowIndices1 = [s for s in listHalfIndices1 if mySum(s) <= epsilon]
    #print(listLowIndices1)
    #print([mySum(s) for s in listLowIndices1])

    listLowIndices2 = [s for s in listHalfIndices2 if mySum2(s) <= epsilon]
    #print(listLowIndices2)
    #print([mySum2(s) for s in listLowIndices2])

    combinationOfIndices1 = [list(set(sum(i, ()))) for i in itertools.chain(itertools.product(listLowIndices1, listLowIndices2))]
    #print(combinationOfIndices1)
    #print([mySum2(s) for s in combinationOfIndices1])

    answer1 = [i for i in combinationOfIndices1 if len(i) == (2*halfsize) and mySum2(i)<=epsilon]
    if(len(answer1) > 0):
        print (answer1)
        print([mySum2(s) for s in answer1])
        break

    # Second answer if the sum of the first and second list is larger than 2-epsilon
    print("ANSWER TYPE 2")
    #print([mySum(s) for s in listHalfIndices1[-10:-1]])
    #print([mySum2(s) for s in listHalfIndices2[-10:-1]])

    listHighIndices1 = [s for s in listHalfIndices1 if mySum(s) >= 1-epsilon]
    #print(listHighIndices1)

    listHighIndices2 = [s for s in listHalfIndices2 if mySum2(s) >= 1-epsilon]
    #print(listHighIndices2)

    combinationOfIndices2 = [list(set(sum(i, ()))) for i in itertools.chain(itertools.product(listHighIndices1, listHighIndices2))]
    #print(combinationOfIndices2)
    #print([mySum2(s) for s in combinationOfIndices2])

    answer2 = [i for i in combinationOfIndices2 if len(i) == (2*halfsize) and mySum2(i)>=1-epsilon]
    if(len(answer2) > 0):
        print (answer2)
        print([mySum2(s) for s in answer2])
        break
    """
    # Third answer if the sum of the first and second list is between 1-epsilon and 1+epsilon
    #print("ANSWER TYPE 3")
    i = 0
    j = len(listHalfIndices2) - 1
    answer3 = None
    print("Answer of type 3 will explore at most %s combinations "%(2*len(listHalfIndices2)))
    for k in range(2*len(listHalfIndices2)): 
        if(k%1000000 == 0 and k>0):
            print(k)
        setOfIndices = list(set(listHalfIndices1[i] + listHalfIndices2[j]))
        #print(setOfIndices)

        currentSum = mySum(listHalfIndices1[i]) + mySum2(listHalfIndices2[j])
        #print(currentSum)
        if(currentSum < 1-epsilon):
            i = i+1
            if(i>=len(listHalfIndices1)):
                break
            #print("i++")
        else:
            j = j-1
            if(j<0):
                break
            #print("j--")
        if(len(setOfIndices) < (halfsize+halfsize2)):
            #print("skipping")
            continue

        if(currentSum >= 1-epsilon and currentSum <= 1+epsilon):
            print("Found smart combination with sum : s=%s"%(currentSum))
            print(sorted([myList[i] for i in setOfIndices]))
            answer3 = setOfIndices
            break

    if answer3 is not None:
        break

end = time.time()
print("Time elapsed %s"%(end-start))
