
How do Python generators know who is calling?

This question is making me pull my hair out.

If I do:

def mygen():
    for i in range(100):
        yield i

and call it from one thousand threads, how does the generator know what to send next for each thread? Every time I call it, does the generator save a table with the counter and the caller reference, or something like that?

It's weird.

Please, clarify my mind on that one.

mygen does not have to remember anything. Every call to mygen() returns an independent generator object, which is an iterator. These iterators, on the other hand, have state: every time next() is called on one, execution resumes at the place in the generator code where it last stopped, and when a yield is encountered, control is handed back to the caller. The actual implementation is rather messy, but in principle you can imagine that such an iterator stores the local variables, the bytecode, and the current position in the bytecode (aka instruction pointer). There is nothing special about threads here.
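To make the "stored state" idea concrete, here is a minimal sketch that peeks inside two generator objects using CPython's introspection attributes (gi_frame, gi_code) and inspect.getgeneratorstate. The names gen_a and gen_b are just for illustration, and the attributes are CPython details rather than something you would normally rely on:

```python
import inspect

def mygen():
    for i in range(100):
        yield i

gen_a = mygen()
gen_b = mygen()

# A freshly created generator has not run any of its body yet.
state_before = inspect.getgeneratorstate(gen_a)

next(gen_a)
next(gen_a)            # gen_a has now yielded 0 and 1
first_b = next(gen_b)  # gen_b has only yielded 0

# Each generator object carries its own suspended frame with its own locals.
local_i_a = gen_a.gi_frame.f_locals["i"]
local_i_b = gen_b.gi_frame.f_locals["i"]

# The compiled code is shared; only the frame (the state) is per object.
same_code = gen_a.gi_code is gen_b.gi_code
separate_frames = gen_a.gi_frame is not gen_b.gi_frame
```

Both objects run the same bytecode, but each one remembers its own value of i and its own position in the loop.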

A function like this, when called, will return a generator object. If you have separate threads calling next() on the same generator object, they will interfere with each other. That is to say, 5 threads calling next() 10 times each will get 50 different values between them, not the same 10 values five times over.
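Here is a rough sketch of the "5 threads, 10 calls each" case. The lock is my own addition and is not part of the original question: it serializes the next() calls so the example behaves the same on every run. The names shared, lock, seen and worker are made up for the example:

```python
import threading

def mygen():
    for i in range(100):
        yield i

shared = mygen()         # one generator object shared by all threads
lock = threading.Lock()  # assumption: serialize access to the generator
seen = []

def worker():
    for _ in range(10):
        with lock:
            # Whoever holds the lock gets the next value;
            # the generator does not know or care which thread asked.
            seen.append(next(shared))

threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 50 calls in total consumed the values 0..49, each exactly once.
print(sorted(seen) == list(range(50)))
```

Which thread receives which value depends on scheduling, but no value is handed out twice, because there is only one counter and it lives in the shared object.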

If two threads each create a generator by calling mygen() within the thread, they will have separate generator objects.
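For contrast, a small sketch where every thread calls mygen() itself. The results dictionary and the thread names t0, t1, t2 are only there to collect the output for the example:

```python
import threading

def mygen():
    for i in range(100):
        yield i

results = {}

def worker(name):
    # Each call to mygen() builds a brand-new generator object,
    # so this thread sees the full sequence from the start.
    results[name] = list(mygen())

threads = [threading.Thread(target=worker, args=(f"t{n}",)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(all(values == list(range(100)) for values in results.values()))
```

No thread affects another here, since nothing is shared except the function itself.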

A generator is an object, and its state is stored in memory, so two threads that each call mygen() will refer to separate objects. It is no different from two threads each creating an object from a class: they'll each have a different object, even though the class is the same.
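If the class comparison helps, this is roughly what mygen() would look like written by hand as an iterator class. It is only an approximation for illustration (the class name MyGenLike and the attribute _i are invented here), not how Python implements generators internally:

```python
class MyGenLike:
    """Hand-written iterator that behaves roughly like mygen()."""

    def __init__(self):
        self._i = 0  # the state lives on the instance, not on the class

    def __iter__(self):
        return self

    def __next__(self):
        if self._i >= 100:
            raise StopIteration
        value = self._i
        self._i += 1
        return value

a = MyGenLike()
b = MyGenLike()
first_three = (next(a), next(a), next(b))  # a and b advance independently
```

Two instances keep two separate counters, which is exactly the situation with two generator objects created from the same generator function.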

If you're coming at this from a C background, this is not the same thing as a function with static variables. The state is maintained in an object, not statically in the variables contained in the function.

It might be clearer if you look at it this way. Instead of:

for i in mygen():
    . . .

use:

gen_obj = mygen()
for i in gen_obj:
    . . .

Then you can see that mygen() is only called once, that it creates a new object, and that it is this object that gets iterated. You could create two sequences in the same thread, if you wanted:

gen1 = mygen()
gen2 = mygen()
print(next(gen1), next(gen2), next(gen1), next(gen2))

This will print 0 0 1 1.

You could access the same iterator from two threads if you like. Just store the generator object in a global:

global_gen = mygen()

Thread 1:

for i in global_gen:
    . . .

Thread 2:

for i in global_gen:
    . . .

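This would probably cause all kinds of havoc: each value goes to only one of the two threads, and in CPython a next() call that arrives while the generator is already running in the other thread raises ValueError (generator already executing). :-)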
