
Python list comprehension for loops

I'm reading the Python wikibook and feel confused about this part:

List comprehension supports more than one for statement. It will evaluate the items in all of the objects sequentially and will loop over the shorter objects if one object is longer than the rest.

>>> item = [x+y for x in 'cat' for y in 'pot']
>>> print item
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']

I understand the usage of nested for loops but I don't get

...and will loop over the shorter objects if one object is longer than the rest

What does this mean? (shorter, longer...)

This type of nested loop creates a Cartesian product of the two sequences. Try it:

>>> [x+y for x in 'cat' for y in 'potty']
['cp', 'co', 'ct', 'ct', 'cy', 'ap', 'ao', 'at', 'at', 'ay', 'tp', 'to', 'tt', 'tt', 'ty']
>>> [x+y for x in 'catty' for y in 'pot']
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt', 'tp', 'to', 'tt', 'yp', 'yo', 'yt']

The first for clause in the list comprehension above (i.e., the for x in 'cat' part) is the same as the outer for x in 'cat': loop in this example:

>>> li=[]
>>> for x in 'cat':
...    for y in 'pot':
...       li.append(x+y)
# li=['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']

So the effect of making one sequence shorter or longer is the same as making the 'x' or 'y' loop shorter or longer in two nested loops:

>>> li=[]
>>> for x in 'catty':
...    for y in 'pot':
...       li.append(x+y)
... 
>>> li==[x+y for x in 'catty' for y in 'pot']
True

In each case, the inner sequence is looped over again, from the start, for every item of the outer sequence, regardless of which of the two is shorter. This is unlike zip, where the pairing is terminated at the end of the shorter sequence.
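To make the contrast concrete, here is a minimal sketch (the variable names are arbitrary and chosen only for illustration) that compares how many items each approach produces. The nested comprehension yields one item per combination, so the sizes multiply whichever sequence is shorter, while zip yields only as many pairs as the shorter input has items.

```python
# Illustrative names only; any two sequences would do.
long_seq = 'catty'   # 5 characters
short_seq = 'pot'    # 3 characters

# Nested comprehension: every item of the first sequence is combined
# with every item of the second, so the sizes multiply.
nested_long_outer = [x + y for x in long_seq for y in short_seq]
nested_short_outer = [x + y for x in short_seq for y in long_seq]

# zip: items are paired by position and pairing stops at the shorter input.
zipped = [x + y for x, y in zip(long_seq, short_seq)]

print(len(nested_long_outer))   # 15
print(len(nested_short_outer))  # 15
print(zipped)                   # ['cp', 'ao', 'tt']
```

Swapping which sequence comes first changes the order of the results, not their number.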

Edit

There seems to be confusion (in the comments) about nested loops versus zip.

Nested Loops:

As shown above, this:

[x+y for x in '12345' for y in 'abc']

is the same as two nested 'for' loops with 'x' the outer loop.

With nested loops, the inner y loop runs in full once for each item of the outer x loop.

So:

>>> [x+y for x in '12345' for y in 'ab']
    ['1a', '1b',   # '1' in the x loop
     '2a', '2b',   # '2' in the x loop, b in the y loop
     '3a', '3b',   # '3' in the x loop, back to 'a' in the y loop
     '4a', '4b',   # so on
     '5a', '5b'] 

You can get the same result with product from itertools:

>>> from itertools import product
>>> [x+y for x,y in product('12345','ab')]
['1a', '1b', '2a', '2b', '3a', '3b', '4a', '4b', '5a', '5b']

zip is different: it pairs items by position and stops after the shorter sequence is exhausted:

>>> [x+y for x,y in zip('12345','ab')]
['1a', '2b']
>>> [x+y for x,y in zip('ab', '12345')]
['a1', 'b2']

itertools also provides zip_longest (named izip_longest in Python 2), which keeps pairing until the longest sequence is exhausted and pads the shorter one with a fill value, but the result is different again:

>>> import itertools
>>> [x+y for x,y in itertools.zip_longest('12345','ab',fillvalue='*')]
['1a', '2b', '3*', '4*', '5*'] 
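If what you actually want is the behaviour the wikibook sentence seems to describe (reusing the shorter sequence from its start until the longer one runs out), neither a nested loop nor zip_longest does that. One possible sketch, assuming you know which sequence is the longer one, combines zip with itertools.cycle; the names longer, shorter and paired are made up for this example:

```python
from itertools import cycle

longer = '12345'
shorter = 'ab'

# cycle() repeats 'ab' endlessly; zip() stops as soon as the finite
# (longer) sequence is exhausted, so the shorter one is simply reused.
paired = [x + y for x, y in zip(longer, cycle(shorter))]
print(paired)  # ['1a', '2b', '3a', '4b', '5a']
```

Note that cycle() never ends on its own, so at least one argument to zip must be a finite sequence.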

Well, the Python documentation does not mention any such short/long case: http://docs.python.org/2/tutorial/datastructures.html#list-comprehensions . Having two "for" clauses in a list comprehension means having two loops. The example pointed out by @drewk is correct.

Let me copy it for the sake of explanation:

>>> [x+y for x in '123' for y in 'pot']
['1p', '1o', '1t', '2p', '2o', '2t', '3p', '3o', '3t']
>>>
>>> [x+y for x in '1' for y in 'pot']
['1p', '1o', '1t']
>>>

In both cases, the first "for" forms the outer loop and the second "for" forms the inner loop. That is the only invariant here.
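A small sketch of that invariant: swapping the two "for" clauses keeps the same set of pairs but changes the order in which they are produced, because the clause written first always drives the outer loop. The tuples and variable names below are for illustration only.

```python
# First clause iterates '12' (outer loop), second iterates 'ab' (inner loop).
digits_outer = [(x, y) for x in '12' for y in 'ab']

# Same pairs, but now 'ab' is the outer loop.
letters_outer = [(x, y) for y in 'ab' for x in '12']

print(digits_outer)   # [('1', 'a'), ('1', 'b'), ('2', 'a'), ('2', 'b')]
print(letters_outer)  # [('1', 'a'), ('2', 'a'), ('1', 'b'), ('2', 'b')]
print(sorted(digits_outer) == sorted(letters_outer))  # True
```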
