
Python generator of generator

Here's a problem I got in an interview (Python 3.7):

def add(x,y):
    return x+y

g = (x for x in range(4))
for n in [1,10]:
    g = (add(n,i) for i in g)
list(g)

What does list(g) print? The answer is

20,21,22,23

From the output, my guess is that the add function was applied twice, and both times n was 10. Can someone explain what happens step by step? I am so confused. Many thanks.

The "body" of a generator expression does not capture the current value of n, so n is just a free variable whose value is whatever is assigned to n at the time the generator is actually iterated. (The outermost iterable expression, by contrast, is evaluated immediately when the generator expression is created, so the g after "in" is not a free variable; it is the iterable assigned to g at that moment.)

That is, after the for loop, you have

assert n == 10  # The last value assigned to n

# Pseudocode - every time n is used, it resolves to the *current*
# value of n, not the value n had when the generator expression was 
# defined.
g = (add(10, i) for i in (add(10, i) for i in (x for x in range(4))))
#  *not* (add(10, i) for i in (add(1, i) for i in (x for x in range(4))))
  = (add(10, i) for i in (add(10, i) for i in (0, 1, 2, 3)))
  = (add(10, i) for i in (10, 11, 12, 13))
  = (10 + i for i in (10, 11, 12, 13))

And so

list(g) == [20, 21, 22, 23]
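To see the late lookup of n next to a version that pins n for each layer, here is a minimal sketch. The helper add_layer and the two wrapper functions are my own names for illustration, not part of the question. Passing n as a function argument gives each generator its own local n, which is one common way to get the result the loop seems to promise.

```python
def add(x, y):
    return x + y


def late_binding():
    """Same loop as in the question: n is looked up only when g is consumed."""
    g = (x for x in range(4))
    for n in [1, 10]:
        g = (add(n, i) for i in g)
    # The loop has finished, so every layer sees n == 10.
    return list(g)


def add_layer(n, source):
    """Hypothetical helper: n is a local here, so each layer keeps its own value."""
    return (add(n, i) for i in source)


def early_binding():
    g = (x for x in range(4))
    for n in [1, 10]:
        g = add_layer(n, g)
    return list(g)


print(late_binding())   # [20, 21, 22, 23]
print(early_binding())  # [11, 12, 13, 14]
```

In the second version the inner layer adds 1 and the outer layer adds 10, which is what one would expect if n were captured at each loop step.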

Because g is a generator object.

Unlike a list comprehension, which is evaluated immediately, it is just a generator instance waiting to be iterated.

>>> from inspect import getgeneratorstate
>>> g = (x for x in range(4))
>>> getgeneratorstate(g)
'GEN_CREATED'

>>> next(g)
0
>>> getgeneratorstate(g)
'GEN_SUSPENDED'

>>> list(g)
[1, 2, 3]
>>> getgeneratorstate(g)
'GEN_CLOSED'
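The difference from a list comprehension can also be made visible by recording when the body actually runs. This is only a sketch; traced_square and compare_eager_and_lazy are names I made up for the illustration.

```python
def traced_square(x, log):
    """Record each call so we can see when the work actually happens."""
    log.append(x)
    return x * x


def compare_eager_and_lazy():
    eager_log, lazy_log = [], []

    # The list comprehension runs its body right away.
    squares_list = [traced_square(x, eager_log) for x in range(3)]
    # The generator expression has not run its body at all yet.
    squares_gen = (traced_square(x, lazy_log) for x in range(3))
    before = (list(eager_log), list(lazy_log))

    first = next(squares_gen)      # runs the body exactly once
    after_next = list(lazy_log)
    rest = list(squares_gen)       # runs the remaining iterations
    return before, first, after_next, rest


print(compare_eager_and_lazy())
# (([0, 1, 2], []), 0, [0], [1, 4])
```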

However, the reference to the first generator (x for x in range(4)) that is stored inside the new generator object does not change, because g is just a name that refers to an object in memory.

Name is merely a post-it on a box. - Fluent Python.


So when we pass g, what gets passed is a reference to the object that g refers to, not the name g itself. Therefore, in the following case:

>>> g = (x for x in range(4))
>>> g
<generator object <genexpr> at 0x036babbc>

>>> g = (add(n, i) for i in g)

The g inside the generator expression (add(n, i) for i in g) simply hands the object at 0x036babbc to the expression, and the generator instance created from that expression keeps a reference to that object. So even if g is rebound afterwards, generator instances that already exist are not affected.

So, in sequence:

>>> g = (x for x in range(4))
>>> g
<generator object <genexpr> at 0x0452c22c>  # 1

>>> g = (add(10, i) for i in g)
>>> g
<generator object <genexpr> at 0x044ee178>  # 2
>>> g.gi_frame.f_locals['.0']
<generator object <genexpr> at 0x0452c22c>  # 1 stored

>>> g = (add(10, i) for i in g)
>>> g
<generator object <genexpr> at 0x03bd88c8>  # 3
>>> g.gi_frame.f_locals['.0']
<generator object <genexpr> at 0x044ee178>  # 2 stored

As you can see, each generator expression remembers the generator instance that g referred to when it was created, so the generators keep getting nested.
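The same point can be checked without looking at frame internals. In this sketch (the function name shared_inner_generator is mine), the outer generator keeps working after the name g is rebound, and consuming the inner generator directly is visible through the outer one, because both refer to the same object.

```python
def shared_inner_generator():
    inner = (x for x in range(4))
    outer = (i + 10 for i in inner)  # outer now holds a reference to inner

    g = inner
    g = None  # rebinding the name g does not touch outer

    skipped = next(inner)  # take 0 from inner directly
    # outer only sees what is left in inner: 1, 2, 3
    return skipped, list(outer)


print(shared_inner_generator())  # (0, [11, 12, 13])
```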


 