
Fastest way to find sub-lists of a fixed length, from a given list of values, whose elements sum to a defined number

In Python 3.6, suppose that I have a list of numbers L, and that I want to find all possible sub-lists S of a given pre-chosen length |S|, such that:

  • any S must be shorter than L, that is |S| < |L|
  • any S can only contain numbers present in L
  • numbers in S do not have to be unique (they can appear repeatedly)
  • the sum of all numbers in S should be equal to a pre-determined number N

A trivial solution for this can be found using the Cartesian product with itertools.product. For example, suppose L is a simple list of all integers between 1 and 10 (inclusive), N is 8, and |S| is chosen to be 3. Then:

import itertools
L = range(1,11)
N = 8
Slength = 3
result = [list(seq) for seq in itertools.product(L, repeat=Slength) if sum(seq) == N]
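
As a quick sanity check on the snippet above, here is the same code with its output inspected. Note that order matters in this formulation, so [1, 1, 6] and [6, 1, 1] count as different sub-lists. The printed values are what this particular input produces, not a general result.

```python
import itertools

L = range(1, 11)
N = 8
Slength = 3

# Same brute-force filter as above: build every tuple, keep those summing to N.
result = [list(seq) for seq in itertools.product(L, repeat=Slength) if sum(seq) == N]

print(len(result))   # 21
print(result[:3])    # [[1, 1, 6], [1, 2, 5], [1, 3, 4]]
```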

However, as larger lists L and/or larger |S| are chosen, the above approach becomes extremely slow. In fact, even for L = range(1,101) with |S|=5 and N=80, the computer almost freezes and it takes approximately an hour to compute the result.
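
To put numbers on that slowdown, the rough sketch below compares how many tuples itertools.product has to generate with how many actually sum to N. The helper name binomial is my own, and the stars-and-bars count is only valid under the assumption stated in the comment (L is 1..100 and no single element would need to exceed max(L)).

```python
from math import factorial

def binomial(n, k):
    """Number of ways to choose k items from n (math.comb needs Python 3.8+)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

L = range(1, 101)
Slength = 5
N = 80

# Every tuple that itertools.product(L, repeat=Slength) would generate.
candidates = len(L) ** Slength

# Ordered ways to write N as a sum of Slength positive integers (stars and bars).
# Assumption: L is 1..max(L) and N - (Slength - 1) <= max(L), so no part is cut off.
matches = binomial(N - 1, Slength - 1)

print(candidates)   # 10000000000
print(matches)      # 1502501
```

So the brute-force version examines ten billion tuples to keep roughly one in every 6,600 of them.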

My take is that:

  • there are a lot of unnecessary computations going on under the hood, given the condition that sub-lists should sum to N
  • there are a ton of cache misses from iterating over possibly millions of lists generated by itertools.product, only to keep far fewer of them

So, my question/challenge is: is there a way I can do this in a more computationally efficient way? Unless we are talking hundreds of gigabytes, speed is more critical to me than memory, so the challenge focuses more on speed, even if considerations for memory efficiency are a welcome bonus.

So given an input list and a target length and sum, you want all the permutations (with repetition) of the numbers in the input list such that:

  1. The sum equals the target sum
  2. The length equals the target length

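The following code should be faster, because it stops extending a partial permutation once its sum reaches or exceeds the target (this assumes all numbers are positive):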

# Input
input_list = range(1,101)

# Targets
target_sum = 15
target_length = 5

# Available numbers
numbers = set(input_list)

# Initialize the stack
stack = [[num] for num in numbers]

result = []

# Loop until we run out of permutations 
while stack:
    # Get a permutation from the stack
    current = stack.pop()

    # If it's too short
    if len(current) < target_length:
        # And the sum is too small (a valid cut-off only because every number is positive)
        if sum(current) < target_sum:
            # Then for each available number
            for num in numbers:
                # Append said number and put the resulting permutation back into the stack
                stack.append(current + [num])

    # If it's not too short and the sum equals the target, add to the result!
    elif sum(current) == target_sum:
        result.append(current)

print(len(result))
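
The loop above recomputes sum(current) and copies the list at every step. A possible variation, offered only as a sketch and not as part of the original answer, is to carry the remaining sum down a recursion and prune with a lower and an upper bound. The function name fixed_length_sums and its parameters are my own choices for illustration.

```python
def fixed_length_sums(values, length, target):
    """Yield every ordered sequence of `length` items drawn from `values`
    (repeats allowed) whose elements sum to `target`."""
    values = sorted(set(values))
    if not values:
        return
    lo, hi = values[0], values[-1]
    prefix = []  # shared buffer for the sequence built so far

    def extend(slots_left, remaining):
        if slots_left == 0:
            if remaining == 0:
                yield list(prefix)  # copy, since prefix keeps changing
            return
        # Prune: the slots still to fill can add at least lo * slots_left
        # and at most hi * slots_left, so anything outside is unreachable.
        if remaining < lo * slots_left or remaining > hi * slots_left:
            return
        for v in values:
            prefix.append(v)
            yield from extend(slots_left - 1, remaining - v)
            prefix.pop()

    yield from extend(length, target)


# Same inputs as the answer above: range(1, 101), length 5, sum 15.
print(len(list(fixed_length_sums(range(1, 101), 5, 15))))  # 1001
```

Because it is a generator, results can be consumed one at a time without holding them all in memory. The bounds check relies only on the smallest and largest value, so it does not require the numbers to be positive.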


 