How to find a set of values from a list of lists which contains no repetitions
You have a list of lists in Python, something like this:
l = [[ 1,  2,  3],
     [18, 20, 22],
     [ 3, 14, 16],
     [ 1,  3,  5],
     [18,  2, 16]]
How would you go about selecting one value from each sub-list, such that no single value is repeated, and the sum of the resulting list is minimised?
result = [1, 18, 3, 5, 2]
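To make the two constraints concrete, here is a minimal sketch of a checker for a candidate answer. The name is_valid_selection is hypothetical (it does not appear in the question); it only tests that one value was taken from each sub-list and that nothing repeats, not that the sum is minimal.

```python
def is_valid_selection(rows, selection):
    """Return True if selection takes one value from each row, with no repeats."""
    if len(selection) != len(rows):
        return False
    # Each chosen value must come from the row in the same position.
    if any(value not in row for row, value in zip(rows, selection)):
        return False
    # No value may appear twice.
    return len(set(selection)) == len(selection)


rows = [
    [1, 2, 3],
    [18, 20, 22],
    [3, 14, 16],
    [1, 3, 5],
    [18, 2, 16],
]
print(is_valid_selection(rows, [1, 18, 3, 5, 2]))  # True
print(is_valid_selection(rows, [1, 18, 3, 1, 2]))  # False: 1 is repeated
```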
Here's a compact brute-force solution. It has to perform columns**rows tests, which is not good. I suspect that there's a backtracking algorithm that's generally more efficient, but in the worst case all possibilities may need to be checked.
from itertools import product
lst = [
    [ 1,  2,  3],
    [18, 20, 22],
    [ 3, 14, 16],
    [ 1,  3,  5],
    [18,  2, 16],
]
nrows = len(lst)
m = min((t for t in product(*lst) if len(set(t)) == nrows), key=sum)
print(m)
Output
(1, 18, 3, 5, 2)
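The backtracking idea mentioned above could look something like the sketch below. It is not from the original answer: solve_backtrack is a hypothetical name, and the pruning step assumes that all values are non-negative, so a partial sum can only grow as more rows are added. In the worst case it still visits every possibility.

```python
def solve_backtrack(data):
    """Depth-first search with pruning; returns a minimal selection or None."""
    best = {'total': None, 'choice': None}

    def search(row, chosen, total):
        # Prune: assuming non-negative values, a partial sum that already
        # matches or exceeds the best complete sum cannot improve on it.
        if best['total'] is not None and total >= best['total']:
            return
        if row == len(data):
            best['total'] = total
            best['choice'] = list(chosen)
            return
        # Trying small values first tends to find a good bound early.
        for value in sorted(data[row]):
            if value not in chosen:
                chosen.append(value)
                search(row + 1, chosen, total + value)
                chosen.pop()

    search(0, [], 0)
    return best['choice']


lst = [
    [1, 2, 3],
    [18, 20, 22],
    [3, 14, 16],
    [1, 3, 5],
    [18, 2, 16],
]
print(solve_backtrack(lst))  # [1, 18, 3, 5, 2]
```

The dict is used instead of nonlocal only so that the sketch also runs on Python 2.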
Here's a faster version that uses a recursive generator instead of itertools.product.
def select(data, seq):
    if data:
        for seq in select(data[:-1], seq):
            for u in data[-1]:
                if u not in seq:
                    yield seq + [u]
    else:
        yield seq

def solve(data):
    return min(select(data, []), key=sum)
Here's a modified version of the recursive generator that sorts as it goes, but of course that's slower, and it consumes more RAM. If the input data is sorted, it usually finds the minimum solution quite rapidly, but I can't figure out a foolproof way of getting it to stop when it's found the minimum selection.
def select(data, selected):
    if data:
        for selected in sorted(select(data[:-1], selected), key=sum):
            for u in data[-1]:
                if u not in selected:
                    yield selected + [u]
    else:
        yield selected
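One possible way to get a search that can stop as soon as it has found the minimum is a best-first (uniform-cost) search over partial selections, kept in a priority queue. This is a sketch rather than part of the original answer: solve_best_first is a hypothetical name, and the early stop is only valid under the assumption that every value is non-negative. The queue can also grow very large, so the RAM concern above still applies.

```python
import heapq


def solve_best_first(data):
    """Uniform-cost search over partial selections.

    Assumes every value is non-negative, so partial sums never decrease
    as more rows are added.
    """
    # Each heap entry is (sum so far, tuple of values chosen so far).
    heap = [(0, ())]
    while heap:
        total, chosen = heapq.heappop(heap)
        if len(chosen) == len(data):
            # Nothing left on the heap has a smaller sum, and extending a
            # partial selection cannot reduce its sum, so this is minimal.
            return list(chosen)
        for value in data[len(chosen)]:
            if value not in chosen:
                heapq.heappush(heap, (total + value, chosen + (value,)))
    return None  # no repetition-free selection exists


lst = [
    [1, 2, 3],
    [18, 20, 22],
    [3, 14, 16],
    [1, 3, 5],
    [18, 2, 16],
]
print(solve_best_first(lst))  # [1, 18, 3, 5, 2]
```

Unlike the sorting generator, this does not depend on the input being pre-sorted: the heap always expands the cheapest partial selection next.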
Here's some timing code that compares the speed of Maurice's and my solutions. It runs on Python 2 and Python 3. I get similar timings on Python 2.6 and Python 3.6 on my old 2 GHz 32-bit machine running an oldish Debian derivative of Linux.
from __future__ import print_function, division
from timeit import Timer
from itertools import product
from random import seed, sample, randrange

n = randrange(0, 1 << 32)
print('seed', n)
seed(n)

def show(data):
    indent = ' ' * 4
    s = '\n'.join(['{0}{1},'.format(indent, row) for row in data])
    print('[\n{0}\n]\n'.format(s))

def make_data(rows, cols):
    maxn = rows * cols
    nums = range(1, maxn)
    return [sample(nums, cols) for _ in range(rows)]

def sort_data(data):
    newdata = [sorted(row) for row in data]
    newdata.sort(reverse=True, key=sum)
    return newdata

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

def solve_Maurice(data):
    result = None
    for item in product(*data):
        if len(item) > len(set(item)):
            # Try the next combination if there are duplicates
            continue
        if result is None or sum(result) > sum(item):
            result = item
    return result

def solve_prodgen(data):
    rows = len(data)
    return min((t for t in product(*data) if len(set(t)) == rows), key=sum)

def select(data, seq):
    if data:
        for seq in select(data[:-1], seq):
            for u in data[-1]:
                if u not in seq:
                    yield seq + [u]
    else:
        yield seq

def solve_recgen(data):
    return min(select(data, []), key=sum)

funcs = (
    solve_Maurice,
    solve_prodgen,
    solve_recgen,
)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

def verify():
    for func in funcs:
        fname = func.__name__
        seq = func(data)
        print('{0:14} {1}'.format(fname, seq))
    print()

def time_test(loops, reps):
    ''' Print timing stats for all the functions '''
    timings = []
    for func in funcs:
        fname = func.__name__
        setup = 'from __main__ import data, ' + fname
        cmd = fname + '(data)'
        t = Timer(cmd, setup)
        result = t.repeat(reps, loops)
        result.sort()
        timings.append((result, fname))

    timings.sort()
    for result, fname in timings:
        print('{0:14} {1}'.format(fname, result))

rows, cols = 6, 4
print('Number of selections:', cols ** rows)

data = make_data(rows, cols)
data = sort_data(data)
show(data)

verify()

loops, reps = 100, 3
time_test(loops, reps)
Typical output
seed 22290
Number of selections: 4096
[
[6, 11, 22, 23],
[9, 14, 17, 19],
[5, 9, 16, 22],
[5, 6, 9, 13],
[1, 3, 6, 22],
[4, 5, 6, 13],
]
solve_Maurice (11, 9, 5, 6, 1, 4)
solve_prodgen (11, 9, 5, 6, 1, 4)
solve_recgen [11, 9, 5, 6, 1, 4]
solve_recgen [0.5476037560001714, 0.549133045002236, 0.5647858490046929]
solve_prodgen [1.2500368960027117, 1.296529343999282, 1.3022710209988873]
solve_Maurice [1.485518219997175, 1.489505891004228, 1.784105566002836]
EDIT: My previous solution only works in most cases; this should do the trick in all cases:
from itertools import product

l = [[1, 2, 3], [18, 20, 22], [3, 14, 16], [1, 3, 5], [18, 2, 16]]

result = None
for item in product(*l):
    if len(item) > len(set(item)):
        # Try the next combination if there are duplicates
        continue
    if result is None or sum(result) > sum(item):
        result = item

print(result)
Output
(1, 18, 3, 5, 2)