Find all combinations that add up to a given number in Python with a list of lists

I've seen plenty of threads on how to find all combinations that add up to a number with one list, but I wanted to know how to extend this so that you pick exactly one number from each list in a list of lists.

Question:
You must select 1 number from each list. How do you find all combinations that sum to N?

Given:
3 lists of differing fixed lengths (e.g. l1 will always have 6 values, l2 will always have 10 values, etc.):

l1 = [0.013,0.014,0.015,0.016,0.017,0.018]
l2 = [0.0396,0.0408,0.042,0.0432,0.0444,0.045,0.0468,0.048,0.0492,0.0504]
l3 = [0.0396,0.0408]

Desired Output:
If N = 0.0954 then the output is [0.015, 0.0396, 0.0408], [0.015, 0.0408, 0.0396].

What I have tried:

output = sum(list(product(l1,l2,l3,l4,l5,l6,l7,l8)))

However, this is too expensive: my largest bucket has 34 values, which creates too many combinations.

Any help/tips on how to approach this in a more efficient manner would be greatly appreciated!

My solution

So, my attempt with branch and bound:


def bb(target):
    L=[l1,l2,l3,l4,l5,l6,l7,l8]
    mn=[min(l) for l in L]
    mx=[max(l) for l in L]
    return bbrec([], target, L, mn, mx)
    
eps=1e-9

def bbrec(sofar, target, L, mn, mx):
    if len(L)==0:
        if target<eps and target>-eps: return [sofar]
        return []
    if sum(mn)>target+eps: return []
    if sum(mx)<target-eps: return []
    res=[]
    for x in L[0]:
        res += bbrec(sofar+[x], target-x, L[1:], mn[1:], mx[1:])
    return res

Note that it is clearly not optimized. For example, to avoid list appending, it might be faster to work with an 8-element list from the start (for example, for sofar, filled with None slots at the beginning), or to create an iterator (yielding results as we find them, rather than appending them).
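
A minimal sketch of those optimizations, under assumptions of my own: the name bb_fast, the lists parameter (instead of the globals l1..l8) and the precomputed suffix sums min_rest/max_rest do not appear in the code above. It fills a fixed-size buffer in place, yields results instead of appending them, and replaces the repeated sum(mn)/sum(mx) with a lookup.

```python
def bb_fast(lists, target, eps=1e-9):
    """Yield every tuple (one value per list) whose sum is within eps of target."""
    n = len(lists)
    # min_rest[i] / max_rest[i]: smallest / largest total obtainable from lists[i:].
    min_rest = [0.0] * (n + 1)
    max_rest = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        min_rest[i] = min_rest[i + 1] + min(lists[i])
        max_rest[i] = max_rest[i + 1] + max(lists[i])

    sofar = [None] * n  # fixed-size buffer, overwritten in place

    def rec(i, remaining):
        if i == n:
            if -eps < remaining < eps:
                yield tuple(sofar)  # copy, because the buffer is reused
            return
        # Prune: the lists still to be visited cannot reach the remaining target.
        if min_rest[i] > remaining + eps or max_rest[i] < remaining - eps:
            return
        for x in lists[i]:
            sofar[i] = x
            yield from rec(i + 1, remaining - x)

    yield from rec(0, target)


# The three lists from the question.
l1 = [0.013, 0.014, 0.015, 0.016, 0.017, 0.018]
l2 = [0.0396, 0.0408, 0.042, 0.0432, 0.0444, 0.045, 0.0468, 0.048, 0.0492, 0.0504]
l3 = [0.0396, 0.0408]
for sol in bb_fast([l1, l2, l3], 0.0954):
    print(sol)
```

Whether this is actually faster than the version above would have to be measured; it is only meant to show the shape of the change.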

But as is, it is already 40 times faster than the brute force method on my generated data (giving the exact same result). Which is something, considering that this is pure Python, whereas brute force can use my beloved itertools (that is Python also, of course, but the iterations are done faster, since they happen inside the implementation of itertools, not in Python code).

And I must confess brute force was faster than expected. But still 40 times too slow.

Explanation

The general principle of branch and bound is to enumerate all possible solutions recursively (the reasoning being "there are len(l1) sorts of solutions: those containing l1[0], those containing l1[1], ...; and among the first category, there are len(l2) sorts of solutions, ..."). Which, so far, is just another implementation of brute force. Except that during recursion, you can cut whole branches (whole subsets of all candidates) if you know that finding a solution is impossible from where you are.

It is probably clearer with an example, so let's use yours.

bbrec is called with

  • a partial solution (starting with an empty list [], and ending with a list of 8 numbers)
  • a target for the sum of the remaining numbers
  • a list of lists from which we must still take numbers (so at the beginning, your 8 lists; once we have chosen the 1st number, the 7 remaining lists; etc.)
  • a list of the minimum values of those lists (8 numbers at first, being the 8 minimum values)
  • a list of the maximum values

It is called at first with ([], target, [l1,...,l8], [min(l1),...,min(l8)], [max(l1),...,max(l8)]).

And each call is supposed to choose a number from the first list, and call bbrec recursively to choose the remaining numbers.

The eighth recursive call will be made with sofar being a list of 8 numbers (a solution, or candidate), and target being what we still have to find in the rest. Since there is no rest, it should be 0. L, mn, and mx are empty lists. So when we see that we are in this situation (that is, len(L)=len(mn)=len(mx)=0 or len(sofar)=8; those 4 criteria are equivalent), we just have to check whether the remaining target is 0. If so, then sofar is a solution. If not, then sofar is not a solution.

If we are not in this situation, that is, if there are still numbers to choose for sofar, bbrec just chooses the next number by iterating over all possibilities from the first remaining list. And, for each of those, it calls itself recursively to choose the remaining numbers.

But before doing so (and those are the 2 lines that make B&B useful; otherwise it is just a recursive implementation of the enumeration of all 8-tuples for 8 lists), we check whether there is at least a chance of finding a solution there.

For example, suppose you are calling bbrec([1,2,3,4], 12, [[1,2,3],[1,2,3], [5,6,7], [8,9,10]], [1,1,5,8], [3,3,7,10]) (note that mn and mx are redundant information: they are just the min and max of the lists, but passing them avoids computing those min and max over and over again).

If you are calling bbrec like this, it means that you have already chosen 4 numbers from the first 4 lists, and you need to choose 4 other numbers from the 4 remaining lists that are passed as the 3rd argument.

And the total of the 4 numbers you still have to choose must be 12.

But you also know that any combination of 4 numbers from the 4 remaining lists will sum to a total between 1+1+5+8=15 and 3+3+7+10=23.

So there is no need to even bother enumerating all the solutions starting with [1,2,3,4] and continuing with 4 numbers chosen from [1,2,3], [1,2,3], [5,6,7], [8,9,10]. It is a lost cause: no choice of the remaining 4 numbers will result in a total of 12 anyway (they will all have a total of at least 15).

And that is what explains why this algorithm can beat an itertools-based solution by a factor of 40, using only naive manipulation of lists and for loops.
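
The bound check on the example above can be tried in isolation. This is only a sketch: can_reach is a hypothetical helper name, and it mirrors the two `if sum(mn)...` / `if sum(mx)...` lines of bbrec rather than replacing them.

```python
def can_reach(target, mn, mx, eps=1e-9):
    """Return True if the remaining lists might still sum to target.

    This is a necessary condition only: True does not guarantee a solution.
    """
    return sum(mn) - eps <= target <= sum(mx) + eps


remaining = [[1, 2, 3], [1, 2, 3], [5, 6, 7], [8, 9, 10]]
mn = [min(l) for l in remaining]  # [1, 1, 5, 8]
mx = [max(l) for l in remaining]  # [3, 3, 7, 10]

print(sum(mn), sum(mx))           # 15 23
print(can_reach(12, mn, mx))      # False: the whole branch is cut
print(can_reach(20, mn, mx))      # True: the branch has to be explored
```

One rejected call here skips 3*3*3*3 = 81 candidate tuples, and the saving grows with the number and size of the lists still to visit.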

Brute force solution

If you want to compare for yourself on your example, here is the brute force solution (already given in comments):

import itertools
import math
def brute(target):
    return [k for k in itertools.product(l1,l2,l3,l4,l5,l6,l7,l8) if math.isclose(sum(k), target)]

Generator version

Not really faster. But at least, if the idea is not to build a list of all solutions but to iterate through them, this version allows you to do so (and it is very slightly faster). And since we talked about generators vs. lists in the comments...

eps=1e-9
def bb(target):
    L=[l1,l2,l3,l4,l5,l6,l7,l8]
    mn=[min(l) for l in L]
    mx=[max(l) for l in L]
    return list(bbit([], target, L, mn, mx))
def bbit(sofar, target, L, mn, mx):
    if len(L)==0:
        if target<eps and target>-eps:
            print(sofar)
            yield sofar
        return
    if sum(mn)>target+eps: return
    if sum(mx)<target-eps: return
    for x in L[0]:
        yield from bbit(sofar+[x], target-x, L[1:], mn[1:], mx[1:])

Here, I use it just to build a list (so, no advantage over the first version).

But if you wanted to just print the solutions, for example, you could do:

for sol in bbit([], target, L, mn, mx):
   print(sol)

This would print all solutions without building any list of solutions.
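
A generator also lets the caller stop early, which a list-building version cannot do. The sketch below is self-contained, so it uses a hypothetical compact variant, bbit_sketch, that recomputes the bounds on each call for brevity, and small integer lists made up for illustration.

```python
from itertools import islice


def bbit_sketch(sofar, target, L, eps=1e-9):
    """Hypothetical compact variant of bbit; bounds are recomputed for brevity."""
    if not L:
        if abs(target) < eps:
            yield sofar
        return
    if sum(map(min, L)) > target + eps or sum(map(max, L)) < target - eps:
        return
    for x in L[0]:
        yield from bbit_sketch(sofar + [x], target - x, L[1:], eps)


lists = [[1, 2, 3], [1, 2, 3], [5, 6, 7]]
gen = bbit_sketch([], 9, lists)

first = next(gen, None)          # the search stops as soon as one solution is found
next_two = list(islice(gen, 2))  # and resumes from where it paused
print(first)                     # [1, 1, 7]
print(next_two)                  # [[1, 2, 6], [1, 3, 5]]
```

If only one solution (or the first few) is needed, the rest of the search tree is never visited.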

Example lists

Just for btilly, or those who would like to test their method against the same lists I've used, here are the ones I've chosen:

import numpy as np
l1=list(np.arange(0.013, 0.019, 0.001))
l2=list(np.arange(0.0396, 0.0516, 0.0012))
l3=[0.0396, 0.0498]
l4=list(np.arange(0.02, 0.8, 0.02))
l5=list(np.arange(0.001, 0.020, 0.001))
l6=list(np.arange(0.021, 0.035, 0.001))
l7=list(np.arange(0.058, 0.088, 0.002))
l8=list(np.arange(0.020, 0.040, 0.005))

Non-recursive solution:

from itertools import accumulate, product
from sys import float_info

def test(lists, target):
    # will return a list of 2-tuples, containing sum and elements making it
    convolutions = [(0,())]
    # lower_bounds[i] - what is the least gain we'll get from remaining lists
    lower_bounds = list(accumulate(map(min, lists[::-1])))[::-1][1:] + [0]
    # upper_bounds[i] - what is the max gain we'll get from remaining lists
    upper_bounds = list(accumulate(map(max, lists[::-1])))[::-1][1:] + [0]
    for l, lower_bound, upper_bound in zip(lists, lower_bounds, upper_bounds):
        convolutions = [
            # update sum and extend the list for viable candidates
            (accumulated + new_element, elements + (new_element,))
            for (accumulated, elements), new_element in product(convolutions, l)
            if lower_bound - float_info.epsilon <= target - accumulated - new_element <= upper_bound +  float_info.epsilon
        ]

    return convolutions

Output of test(lists, target):

[(0.09540000000000001, (0.015, 0.0396, 0.0408)),
 (0.09540000000000001, (0.015, 0.0408, 0.0396))]
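
The sums in this output are not printed as 0.0954 because binary floating point cannot represent these decimals exactly, which is why every solution on this page compares with a tolerance or avoids floats. A small illustration using only the standard library:

```python
import math
from decimal import Decimal

s = 0.015 + 0.0396 + 0.0408
print(s)  # the output above shows this sum as 0.09540000000000001

# Comparing with a tolerance is robust against the representation error.
print(math.isclose(s, 0.0954))  # True

# Decimal values built from strings add up exactly.
exact = Decimal("0.015") + Decimal("0.0396") + Decimal("0.0408")
print(exact == Decimal("0.0954"))  # True
```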

This can be further optimized by sorting the lists and slicing them based on the upper/lower bound using bisect:

from bisect import bisect_left, bisect_right
# ...

convolutions = [
    (partial_sum + new_element, partial_elements + (new_element,))
    for partial_sum, partial_elements in convolutions
    for new_element in l[bisect_left(l, target-upper_bound-partial_sum-float_info.epsilon):bisect_right(l, target-lower_bound-partial_sum+float_info.epsilon)]
]

And here is a straightforward dynamic programming solution. I build a data structure which has the answer, and then generate the answer from that data structure.

from dataclasses import dataclass
from decimal import Decimal
from typing import Any

@dataclass
class SummationNode:
    value: Decimal
    solution_tail: Any = None
    next_solution: Any = None

    def solutions (self):
        if self.value is None:
            yield []
        else:
            for rest in self.solution_tail.solutions():
                rest.append(self.value)
                yield rest

        if self.next_solution is not None:
            yield from self.next_solution.solutions()


def all_combinations(target, *lists):
    solution_by_total = {
        Decimal(0): SummationNode(None)
    }

    for l in lists:
        old_solution_by_total = solution_by_total
        solution_by_total = {}
        for x_raw in l:
            x = Decimal(str(x_raw)) # Deal with rounding.
            for prev_total, prev_solution in old_solution_by_total.items():
                next_solution = solution_by_total.get(x + prev_total)
                solution_by_total[x + prev_total] = SummationNode(
                    x, prev_solution, next_solution
                    )
    return solution_by_total.get(Decimal(str(target)))

l1 = [0.013,0.014,0.015,0.016,0.017,0.018]
l2 = [0.0396,0.0408,0.042,0.0432,0.0444,0.045,0.0468,0.048,0.0492,0.0504]
l3 = [0.0396,0.0408]
for answer in all_combinations(0.0964, l1, l2, l3).solutions():
    print(answer)

To check that the logic of this matches the others once rounding errors are fixed, use the following test:

import numpy as np

def arange(start, stop, step):
    return [round(x, 5) for x in list(np.arange(start, stop, step))]

l1=arange(0.013, 0.019, 0.001)
l2=arange(0.0396, 0.0516, 0.0012)
l3=[0.0396, 0.0498]
l4=arange(0.02, 0.8, 0.02)
l5=arange(0.001, 0.020, 0.001)
l6=arange(0.021, 0.035, 0.001)
l7=arange(0.058, 0.088, 0.002)
l8=arange(0.020, 0.040, 0.005)

for answer in all_combinations(0.2716, l1, l2, l3, l4, l5, l6, l7, l8).solutions():
    print([float(x) for x in answer])
