
Improving the performance of cartesian product of multiple lists

I am implementing the Cartesian product of multiple sets in Python using recursion.

Here is my implementation:

def car_two_sets(a, b):
    result = []
    for x in a:
        for y in b:
            result.append(str(x) + str(y))
    return result


def car_multiple_sets(lists):
    if len(lists) == 2:
        return car_two_sets(lists[0], lists[1])
    else:
        return car_multiple_sets([car_two_sets(lists[0], lists[1])] + lists[2:])


a = [1, 2]
b = [3, 4]
c = [6, 7, 8]
lists = [a, b, c]
print(car_multiple_sets(lists))

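The code works correctly, but it is slow for a larger number of sets. Any ideas on how to improve this implementation? I thought of memoization, but could not find any repeated computations to cache.

I do not want to use itertools functions.

Benchmark with three times as many lists (the three fastest times for each function):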

 221 us   223 us   223 us  h
 225 us   227 us   227 us  k3
 228 us   229 us   229 us  k2
 267 us   267 us   267 us  k
 340 us   341 us   342 us  g
1177 us  1185 us  1194 us  car_multiple_sets
3057 us  3082 us  3084 us  f

Code:

from timeit import repeat
from random import shuffle
from bisect import insort
from itertools import product, starmap
from operator import concat

def car_two_sets(a, b):
    result = []
    for x in a:
        for y in b:
            result.append(str(x) + str(y))
    return result


def car_multiple_sets(lists):
    if len(lists) == 2:
        return car_two_sets(lists[0], lists[1])
    else:
        return car_multiple_sets([car_two_sets(lists[0], lists[1])] + lists[2:])

def f(lists):
    return [''.join(map(str,a)) for a in product(*lists)]

def g(lists):
    return [''.join(a) for a in product(*[map(str,a)for a in lists])]

def h(lists):
    return list(map(''.join, product(*[map(str,a)for a in lists])))

def k(lists):
    result = ['']
    for lst in lists:
        lst = [*map(str, lst)]
        result = [S + s for S in result for s in lst]
    return result

def k2(lists):
    result = ['']
    for lst in lists:
        result = list(starmap(concat, product(result, map(str, lst))))
    return result

def k3(lists):
    result = ['']
    for lst in lists:
        result = starmap(concat, product(result, map(str, lst)))
    return list(result)

funcs = [car_multiple_sets, f, g, h, k, k2, k3]

a = [1, 2]
b = [3, 4]
c = [6, 7, 8]
lists = [a, b, c]

for func in funcs:
  print(func(lists), func.__name__)

times = {func: [] for func in funcs}
lists *= 3
for _ in range(50):
  shuffle(funcs)
  for func in funcs:
    t = min(repeat(lambda: func(lists), number=1))
    insort(times[func], t)
for func in sorted(funcs, key=times.get):
    print(*('%4d us ' % (t * 1e6) for t in times[func][:3]), func.__name__)

(f and g are from a now-deleted answer; the k functions are mine.)
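To make the idea behind k easier to follow, here is a minimal traced sketch of the same loop without itertools. The name product_strings_traced and the steps list are illustrative additions, not part of the benchmark above. The loop starts from a single empty string and extends every partial result with each element of the next list, so no recursion is needed.

```python
def product_strings_traced(lists):
    """Iterative string product in the style of k; also records each step."""
    result = ['']                 # product of zero lists: one empty string
    steps = [result]
    for lst in lists:
        strs = [str(x) for x in lst]   # convert each element once per list
        # extend every partial string with every element of the current list
        result = [prefix + s for prefix in result for s in strs]
        steps.append(result)
    return result, steps


result, steps = product_strings_traced([[1, 2], [3, 4], [6, 7, 8]])
for step in steps:
    print(len(step), step)
```

The sizes of the intermediate lists grow as 1, 2, 4, 12, that is, by a factor equal to the length of each input list.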

A few comments:

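  • If you think about it, all car_multiple_sets does is iterate over its argument lists. You are using recursion to do that, but iterating over a list can also be done with a for loop. Recursion happens to be somewhat slow and memory-inefficient in Python, so a for loop is preferable.

  • You do not need to convert to str to group integers together. You can use tuples; that is exactly what they are for. Replace str(x)+str(y) with (x,y) to get a pair of two integers instead of a string.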

def car_two_sets(a, b, unpack=False):
    if unpack:
        return [(*x, y) for x in a for y in b]
    else:
        return [(x, y) for x in a for y in b]

def car_multiple_sets(lists):
    if len(lists) == 0:
        return [()]   # empty Cartesian product has one element, the empty tuple
    elif len(lists) == 1:
        return list(zip(lists[0]))   # list of "1-uples" for homogeneity
    else:
        result = car_two_sets(lists[0], lists[1])
        for l in lists[2:]:
            result = car_two_sets(result, l, unpack=True)
        return result

print( car_multiple_sets((range(3), 'abc', range(2))) )
# [(0, 'a', 0), (0, 'a', 1), (0, 'b', 0), (0, 'b', 1), (0, 'c', 0), (0, 'c', 1),
#  (1, 'a', 0), (1, 'a', 1), (1, 'b', 0), (1, 'b', 1), (1, 'c', 0), (1, 'c', 1),
#  (2, 'a', 0), (2, 'a', 1), (2, 'b', 0), (2, 'b', 1), (2, 'c', 0), (2, 'c', 1)]
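Another reason to prefer tuples, sketched below under illustrative names (product_tuples and product_strings are not from the answers above): once values have more than one digit, concatenated strings can no longer tell distinct combinations apart. The tuple version here is also seeded with [()], which is one possible way to drop the special cases for zero or one input list.

```python
def product_tuples(lists):
    """Cartesian product as a list of tuples, built with a plain loop."""
    result = [()]                 # product of zero lists: one empty tuple
    for lst in lists:
        result = [(*prefix, x) for prefix in result for x in lst]
    return result


def product_strings(lists):
    """Same loop, but concatenating str(x) as in the question."""
    result = ['']
    for lst in lists:
        result = [prefix + str(x) for prefix in result for x in lst]
    return result


lists = [[1, 12], [23, 3]]
print(product_strings(lists))   # '123' appears twice: 1 + 23 and 12 + 3
print(product_tuples(lists))    # four distinct pairs
```

With strings, the combinations (1, 23) and (12, 3) both collapse to '123'; with tuples they stay distinct.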
