Finding the smallest number of elements that sum to a value (Python)

I have a set of positive integers

values = [15, 23, 6, 14, 16, 24, 7]

which can be chosen with replacement to sum (modulo 25) to a number between 0 and 24 (inclusive), where the fewer values used, the better.

For example, 16 + 16 (mod 25) = 32 (mod 25) = 7, but 7 (mod 25) = 7 uses fewer additions and is therefore preferred.

My current approach is a sequence of increasingly nested for loops that generate all possible answers up to a point, after which I find the smallest number of values required by eye. I use quicksort as a separate function to avoid repeated answers.

# n is the target residue (0 to 24); n = 3 reproduces the output shown below.
n = 3
# quicksort is my own helper (not shown) that returns the list sorted;
# the built-in sorted() would work equally well.
answers = []
for i in values:
    if i % 25 == n:
        if [i] not in answers:
            answers.append([i])
if not answers:
    for i in values:
        for j in values:
            if (i + j) % 25 == n:
                check = quicksort([i, j])
                if check not in answers:
                    answers.append(check)
if not answers:
    for i in values:
        for j in values:
            for k in values:
                if (i + j + k) % 25 == n:
                    check = quicksort([i, j, k])
                    if check not in answers:
                        answers.append(check)
for i in answers:
    print(i)

A typical output is then

[14, 14]

from which I can see that [14, 14] is the most efficient sum.

I know from brute forcing that at most four values are required to reach every possible choice of n, but this seems like a very tedious way of finding the most efficient sum. Is there a more elegant algorithm?

EDIT: extra examples.

If we choose n = 13, the code spits out

[15, 23]
[6, 7]
[14, 24]

and choosing n = 18 outputs

[14, 15, 15]
[6, 15, 23]
[23, 23, 23]
[7, 14, 23]
[6, 6, 7]
[6, 14, 24]
[14, 14, 16]

To clarify, the code works; it just seems messy and unnecessarily thorough.

The key is to use combinations_with_replacement() from the standard library's itertools module. You can use it for any number of elements per combination. Here is my code, which prints your examples and is somewhat more general. (Note that your last example has the target 19, but you mis-typed that as 18.)

from itertools import combinations_with_replacement

def print_modulo_sums(values, target, modulus, maxsize):
    """Print all multisets (sets with possible repetitions) of minimum
    cardinality from the given values that sum to the target,
    modulo the given modulus. If no such multiset with cardinality
    less than or equal the given max size exists, print nothing.
    """
    print("\nTarget = ", target)
    found_one = False
    for thissize in range(1, maxsize + 1):
        for multiset in combinations_with_replacement(values, thissize):
            if sum(multiset) % modulus == target:
                print(sorted(multiset))
                found_one = True
        if found_one:
            return

values = [15, 23, 6, 14, 16, 24, 7]

print_modulo_sums(values, 7, 25, 5)
print_modulo_sums(values, 3, 25, 5)
print_modulo_sums(values, 13, 25, 5)
print_modulo_sums(values, 19, 25, 5)

The printout is:

Target =  7
[7]

Target =  3
[14, 14]

Target =  13
[15, 23]
[6, 7]
[14, 24]

Target =  19
[14, 15, 15]
[6, 15, 23]
[23, 23, 23]
[7, 14, 23]
[6, 6, 7]
[6, 14, 24]
[14, 14, 16]

Adding a simple loop at the end confirms that, for your given set of values and given modulus, a multiset with at most 4 members will sum to any given value from 0 through 24. The values 0 and 8 are the only ones requiring four: all others require at most three.
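The "simple loop" itself is not shown in the answer. A minimal sketch of what it might look like follows; the helper name min_multiset_size is hypothetical, and it uses the same brute-force enumeration as print_modulo_sums, returning only the size of the first match.

```python
from itertools import combinations_with_replacement

def min_multiset_size(values, target, modulus, maxsize):
    """Return the smallest number of values (repetition allowed) whose sum is
    congruent to target modulo modulus, or None if none exists up to maxsize."""
    for size in range(1, maxsize + 1):
        for multiset in combinations_with_replacement(values, size):
            if sum(multiset) % modulus == target:
                return size
    return None

values = [15, 23, 6, 14, 16, 24, 7]

# Map every residue 0..24 to the minimum number of values needed to reach it.
sizes = {target: min_multiset_size(values, target, 25, 5) for target in range(25)}

for target, size in sizes.items():
    print(target, size)
```

For these values and modulus 25, the loop reports four for the targets 0 and 8 and at most three for every other target, in line with the statement above.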

First, you can express the whole thing as a nice recursive procedure

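def checker1(values, n, length):
  # Return a list of exactly `length` values summing to n (mod 25), or False.
  if length == 0:
    return False

  for value in values:
    if length == 1 and value % 25 == n:
      return [value]
    else:
      # Fix one value and search for the remainder with one element fewer.
      recurrent_call = checker1(values, (n - value) % 25, length - 1)
      if recurrent_call:
        return [value] + recurrent_call

  return False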
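To get a shortest solution out of a fixed-length search like this, the length has to be tried in increasing order. The sketch below is a self-contained variant rather than the answer's exact code: the names find_exact and shortest, the modulus parameter, and the use of None instead of False are all assumptions made for illustration.

```python
def find_exact(values, n, length, modulus=25):
    """Return a list of exactly `length` values whose sum is congruent to n
    modulo modulus, or None if no such list exists."""
    if length == 0:
        # An empty selection sums to 0, so it only matches residue 0.
        return [] if n % modulus == 0 else None
    for value in values:
        rest = find_exact(values, (n - value) % modulus, length - 1, modulus)
        if rest is not None:
            return [value] + rest
    return None

def shortest(values, n, max_length, modulus=25):
    """Try lengths 1, 2, ..., max_length and return the first solution found."""
    for length in range(1, max_length + 1):
        result = find_exact(values, n, length, modulus)
        if result is not None:
            return result
    return None

values = [15, 23, 6, 14, 16, 24, 7]
print(shortest(values, 3, 5))   # [14, 14]
```

Because lengths are tried in increasing order, the first hit is a shortest solution, at the price of repeating the shorter searches on every call.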

This recursive version has exactly the same complexity as before, but now it is generic, and you just run it in a loop with length going from 1 up. Complexity-wise, you can do better with dynamic programming: once you have gone through all pairs, you can go through triples faster by iterating just once over the entire list and checking whether the required sums were cached earlier. Let's do it.

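def checker2(values, n, max_length):

  # _cache maps each reachable residue (mod 25) to one shortest list producing it.
  _cache = {}
  for value in values:
    _cache.setdefault(value % 25, [value])

  if n in _cache:
    return _cache[n]

  for length in range(2, max_length+1):

    # Iterate over a snapshot of the keys: the dict grows inside the loop, and
    # entries added in this pass must not be extended until the next pass.
    known = list(_cache.keys())
    for value in values:
      for value_in_cache in known:
        value_mod = (value_in_cache + value) % 25
        if value_mod not in _cache:
          _cache[value_mod] = _cache[value_in_cache] + [value]

    if n in _cache:
      return _cache[n]

  return False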

This reduces the computational cost considerably, as we never re-calculate what we already know. The expected complexity (assuming that a dictionary lookup in Python is O(1)) is now:

O(max_length * len(values))

whereas before it was polynomial of degree max_length in len(values), i.e. O(len(values)^max_length)! We saved a lot by making the inner loop run over the keys of _cache, which cannot hold more than 25 entries, so it is a constant-cost loop. And since max_length never needs to exceed the modulus (a pass that adds no new residue to _cache means no later pass will either, and there are only 25 residues), the total cost stays within O(25 * 25 * len(values)), i.e. linear in len(values), even if you have some really complex set of values.

And a quick test:

print(checker2(values, 23, 5))       # [23]
print(checker2(values, 13, 5))       # [23, 15]
print(checker2(values, 19, 5))       # [14, 15, 15] (one of several shortest solutions)

These approaches assume you only care about a shortest solution, and not all solutions. If you care about all solutions you can still go this way, by storing lists of the values "cached" in the same bucket and then returning all the combinations, but then you do not save much computation.
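One possible reading of "storing lists of values in the same bucket" is sketched below. The function name all_shortest and the modulus parameter are hypothetical. Each bucket holds every multiset of the current size for a given residue, so, as noted above, little work is saved compared with plain enumeration.

```python
def all_shortest(values, n, max_length, modulus=25):
    """Return every multiset (as a sorted tuple) of minimum size whose sum is
    congruent to n modulo modulus, or an empty set if none exists."""
    # layer maps residue -> set of sorted tuples, all of the current size.
    layer = {}
    for value in values:
        layer.setdefault(value % modulus, set()).add((value,))

    for size in range(1, max_length + 1):
        if n in layer:
            return layer[n]
        if size == max_length:
            break
        # Extend every multiset of this size by one more value.
        next_layer = {}
        for residue, combos in layer.items():
            for combo in combos:
                for value in values:
                    key = (residue + value) % modulus
                    extended = tuple(sorted(combo + (value,)))
                    next_layer.setdefault(key, set()).add(extended)
        layer = next_layer
    return set()

values = [15, 23, 6, 14, 16, 24, 7]
for combo in sorted(all_shortest(values, 13, 5)):
    print(list(combo))
```

Storing sorted tuples in a set removes duplicates that differ only in the order of the values, which is the job the sorting step does in the question's code.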
