itertools.product: how to improve the performance?

I need to generate the product of a list of itertools.permutations generators, and I use the following code:

import itertools

def iter_version():
  l = [itertools.permutations(range(10)) for _ in range(10)]
  g = itertools.product(*l)
  for i in g:
    yield i

But this code is WAY TOO slow. It takes 16 seconds on my desktop. cProfile shows nothing beyond telling me that this function takes 16 seconds.

If I instead write an insane nested for loop like this:

def for_loop():
  l = [itertools.permutations(range(10)) for _ in range(10)]
  for i0 in l[0]:
    for i1 in l[1]:
      for i2 in l[2]:
        for i3 in l[3]:
          for i4 in l[4]:
            for i5 in l[5]:
              for i6 in l[6]:
                for i7 in l[7]:
                  for i8 in l[8]:
                    for i9 in l[9]:
                      yield (i0, i1, i2, i3, i4, i5, i6, i7, i8, i9)

This runs almost instantly.

In my situation, the list of permutation generators does not have a fixed size, so I cannot use the for loop version.

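As @DSM's answer says, itertools.product converts each input iterable to a concrete sequence. This can be confirmed from http://bugs.python.org/issue10109

To solve this problem without converting the iterables to lists, I used this function instead. Note that it uses recursion, so test it before use.

def product(*args):
    # Recursive product that iterates its arguments lazily instead of
    # converting them to lists first. One-shot iterators are consumed
    # only once, so this behaves like for_loop, not itertools.product.
    if len(args) == 1:
        for i in args[0]:
            yield [i]
    else:
        for i in args[0]:
            for j in product(*args[1:]):
                # Prepend so the elements keep the order of the arguments.
                yield [i] + j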

Respectfully, I don't believe your first code takes 16s to run to completion. There are (3628800)^10, or 395940866122425193243875570782668457763038822400000000000000000000, elements to be yielded. I could imagine it taking 16s on some system to compute the 3628800*10 = 36288000 permutations, though. (Since you don't show how you're calling iter_version, you might only be after next(iter_version()) or something, I guess, although if so there are much simpler ways to get it.)
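For anyone who wants to check those counts, here is a quick sketch of the arithmetic. The variable names are mine, not from the question.

```python
import math

# Number of permutations of range(10), i.e. 10!
perms_per_iterable = math.factorial(10)

# Size of the full Cartesian product of ten such permutation sets.
total_product = perms_per_iterable ** 10

# Number of tuples itertools.product has to store up front:
# ten input iterables, each holding 10! permutations.
materialized = perms_per_iterable * 10

print(perms_per_iterable)
print(total_product)
print(materialized)
```

The last number is the one that plausibly costs seconds; the middle one can never be iterated to the end.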

The real difference between iter_version and for_loop is that itertools.product doesn't materialize the Cartesian product, but it does convert each of its arguments to a concrete sequence first, and those can be iterated over repeatedly. In for_loop, you're exhausting your iterators: each inner permutations iterator is used up on its first pass and yields nothing after that, so you're not doing nearly as much work.
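Here is a small sketch that makes the up-front consumption visible. The tracked generator and the consumed list are hypothetical helpers I made up for illustration; only itertools.product is the real thing.

```python
import itertools

consumed = []  # records every item pulled from the input generators

def tracked(name, n):
    # Hypothetical helper: yields 0..n-1 and logs each item as it is taken.
    for x in range(n):
        consumed.append((name, x))
        yield x

g = itertools.product(tracked("a", 3), tracked("b", 2))
first = next(g)

# By the time the first tuple comes out, both inputs are fully consumed.
print(first)
print(consumed)
```

With ten permutations(range(10)) inputs, that same step means building all 36288000 tuples before the first result appears.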

It's probably easier to see with a smaller case, say (2,2) instead of (10,10):

>>> list(iter_version())
[((0, 1), (0, 1)), ((0, 1), (1, 0)), ((1, 0), (0, 1)), ((1, 0), (1, 0))]
>>> list(for_loop())
[((0, 1), (0, 1)), ((0, 1), (1, 0))]

If you wrap the itertools.permutations call in list(...), though, they become equivalent again:

>>> list(for_loop_materialized_list())
[((0, 1), (0, 1)), ((0, 1), (1, 0)), ((1, 0), (0, 1)), ((1, 0), (1, 0))]
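In case it helps to reproduce the three outputs above, this is roughly what the (2,2) versions could look like. The _small function names are mine, and the materialized variant is my guess at what for_loop_materialized_list does, since its definition isn't shown.

```python
import itertools

def iter_version_small():
    # Two one-shot permutation iterators handed to itertools.product.
    l = [itertools.permutations(range(2)) for _ in range(2)]
    yield from itertools.product(*l)

def for_loop_small():
    # Same iterators, nested loops: l[1] is exhausted after the first i0.
    l = [itertools.permutations(range(2)) for _ in range(2)]
    for i0 in l[0]:
        for i1 in l[1]:
            yield (i0, i1)

def for_loop_materialized_small():
    # Lists can be iterated repeatedly, so the inner loop restarts each time.
    l = [list(itertools.permutations(range(2))) for _ in range(2)]
    for i0 in l[0]:
        for i1 in l[1]:
            yield (i0, i1)

print(list(iter_version_small()))
print(list(for_loop_small()))
print(list(for_loop_materialized_small()))
```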

If you really want the results of iter_version, I suggest you start wanting something else instead. :-)
