So far I have mylist = list(itertools.product(*a)).
The problem with this is that it makes too many tuples. I don't want it to make a tuple if the sum of its elements is > 4. For example, it currently gives:
[(0, 0, 0, 0),
(0, 0, 0, 1),
(0, 0, 0, 2),
(0, 0, 1, 0),
(0, 0, 1, 1),
(0, 0, 1, 2),
(0, 1, 0, 0),
(0, 1, 0, 1),
(0, 1, 0, 2),
(0, 1, 1, 0),
(0, 1, 1, 1),
(0, 1, 1, 2),
(1, 0, 0, 0),
(1, 0, 0, 1),
(1, 0, 0, 2),
(1, 0, 1, 0),
(1, 0, 1, 1),
(1, 0, 1, 2),
(1, 1, 0, 0),
(1, 1, 0, 1),
(1, 1, 0, 2),
(1, 1, 1, 0),
(1, 1, 1, 1),
(1, 1, 1, 2)]
It shouldn't make (1, 1, 1, 2), as it sums to 5; while in this example it's only one, in others it will be considerably more.
If your dataset is large, you could probably use numpy here. numpy.indices provides an equivalent of itertools.product that you can also filter efficiently:
import numpy as np
arr = np.indices((4, 4, 4, 4)).reshape(4,-1).T
mask = arr.sum(axis=1) < 5
res = arr[mask]
print(res)
#[[0 0 0 0]
# [0 0 0 1]
# [0 0 0 2]
# [0 0 0 3]
# [0 0 1 0]
# ...
# [3 0 0 1]
# [3 0 1 0]
# [3 1 0 0]]
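The snippet above uses four axes of size 4. As a minimal sketch, the same idea can be applied to the shape implied by the question's sample output. The axis sizes `(2, 2, 2, 3)` and the name `max_sum` are assumptions inferred from that output, not something stated in the question, and this only works when every axis is a range starting at 0.

```python
import numpy as np

# Assumed axis sizes, inferred from the question's sample output:
# the first three positions take values 0-1, the last takes 0-2.
shape = (2, 2, 2, 3)
max_sum = 4  # keep rows whose sum is at most this value

# One row per combination, in the same order as itertools.product.
arr = np.indices(shape).reshape(len(shape), -1).T

# Boolean mask selects only the rows within the sum limit.
res = arr[arr.sum(axis=1) <= max_sum]
print(res.shape)
```

Note that this still builds the full product in memory before filtering; the gain comes from doing the sum and the selection in vectorized form.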
Otherwise, for small datasets, as mentioned in the comments, itertools.ifilter (the built-in filter in Python 3) is pretty fast:
from itertools import product
gen = product((0,1,2,3), repeat=4)
res = filter(lambda x: sum(x) < 5, gen) # keep tuples whose sum is at most 4
res = list(res) # converting to list only at the end
In this particular case, both approaches give comparable performance.
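If you want to check this on your own machine, a rough timing sketch along these lines may help. The function names `with_numpy` and `with_filter` are made up for illustration, the code assumes Python 3 (so the built-in `filter` stands in for `ifilter`), and the timings will vary with hardware and input size.

```python
import timeit
from itertools import product

import numpy as np


def with_numpy():
    # Build every combination as a row, then keep rows with sum <= 4.
    arr = np.indices((4, 4, 4, 4)).reshape(4, -1).T
    return arr[arr.sum(axis=1) < 5]


def with_filter():
    # Lazily generate combinations and keep those with sum <= 4.
    return list(filter(lambda x: sum(x) < 5, product(range(4), repeat=4)))


# Sanity check: both approaches should select the same combinations.
same = [tuple(row) for row in with_numpy().tolist()] == with_filter()

numpy_time = timeit.timeit(with_numpy, number=1000)
filter_time = timeit.timeit(with_filter, number=1000)
print(f"numpy: {numpy_time:.3f}s  filter: {filter_time:.3f}s  same result: {same}")
```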
If you need even better performance for this specific case, you can always write your optimized routine in C or Cython.
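Before reaching for C or Cython, one pure-Python option is to avoid creating the unwanted tuples at all by abandoning a branch as soon as its partial sum exceeds the limit. This is only a sketch and not part of the approaches above: `bounded_product` is a hypothetical helper, the input `a` is an assumption inferred from the question's sample output, and the pruning is only valid when all values are non-negative.

```python
def bounded_product(iterables, max_sum):
    """Yield tuples from the Cartesian product whose sum is <= max_sum.

    Assumes every value is non-negative, so a partial sum that already
    exceeds max_sum can never come back under the limit.
    """
    pools = [tuple(it) for it in iterables]

    def extend(prefix, total, depth):
        if depth == len(pools):
            yield prefix
            return
        for value in pools[depth]:
            # Skip this branch if it would already exceed the limit.
            if total + value <= max_sum:
                yield from extend(prefix + (value,), total + value, depth + 1)

    yield from extend((), 0, 0)


# Assumed input, inferred from the sample output in the question.
a = [range(2), range(2), range(2), range(3)]
mylist = list(bounded_product(a, 4))
print(len(mylist))
```

The tuples come out in the same order as itertools.product would give them, just without the ones over the limit. The saving grows when many combinations exceed the limit, since whole subtrees are never visited.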