
Python iterators, generators and in between

So I understand generator functions for lazy evaluation, and generator expressions (aka generator comprehensions) as their syntactic-sugar equivalent.

I understand classes like

class Itertest1:
    def __init__(self):
        self.count = 0
        self.max_repeats = 100

    def __iter__(self):
        print("in __inter__()")
        return self

    def __next__(self):
        if self.count >= self.max_repeats:
            raise StopIteration
        self.count += 1
        print(self.count)
        return self.count

as a way of implementing the iterator interface, i.e. __iter__() and __next__() in one and the same class.
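
For reference, a minimal sketch of how such a class is typically driven (the variable names are my own illustration, not part of the original code):

it = Itertest1()
print(iter(it) is it)  # prints "in __iter__()" and then True: __iter__ returns the instance itself
print(next(it))        # prints 1 inside __next__ and 1 again here
for x in it:           # calls __iter__ once more, then continues from 2 up to 100
    pass               # the loop ends when __next__ raises StopIteration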

But what then is

class Itertest2:
    def __init__(self):
        self.data = list(range(100))

    def __iter__(self):
        print("in __inter__()")
        for i, dp in enumerate(self.data):
            print("idx:", i)
            yield dp

which uses the yield statement within the __iter__ member function?

Also I noticed that upon calling the __iter__ member function

it = Itertest2().__iter__()
batch = it.__next__()

the print statement is only executed when calling __next__() for the first time. Is this due to this weird mixture of yield and __iter__? I find this quite counterintuitive...
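
To make this concrete, here is a small sketch of the behaviour I mean, with the output annotated in comments:

it = Itertest2().__iter__()  # returns a generator object; nothing is printed yet
print(type(it))              # <class 'generator'>
batch = it.__next__()        # only now "in __iter__()" and "idx: 0" are printed
print(batch)                 # 0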

Having a yield statement anywhere in a function wraps the function's code in a (native) generator object and effectively replaces the function with a stub that, when called, gives you said generator object.

So, here, calling __iter__ will give you an anonymous generator object that executes the code you want.
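
For instance, a minimal sketch (the function gen below is just an illustration, not from the original code):

def gen():
    print("body starts")
    yield 1
    yield 2

g = gen()        # no output: calling the stub only builds the generator object
print(type(g))   # <class 'generator'>
print(next(g))   # prints "body starts" and then 1: execution starts on the first next()
print(next(g))   # prints 2: the body resumes right after the first yield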

The main use case for __next__ is to provide a way to write an iterator without relying on (native) generators.

The use case of __iter__ is to distinguish between an object and an iteration state over said object. Consider code like

c = some_iterable()
for a in c:
    for b in c:
        pass  # do something with a and b

You would not want the two interleaved iterations to interfere with each other's state. This is why such a loop would desugar to something like

c = some_iterable()
_iter1 = iter(c)
try:
    while True:
        a = next(_iter1)
        _iter2 = iter(c)
        try:
            while True:
                b = next(_iter2)
                # do something with a and b
        except StopIteration:
            pass
except StopIteration:
    pass

Typically, custom iterators implement a stub __iter__ that returns self, so that iter(iter(x)) is equivalent to iter(x). This is important when writing iterator wrappers.
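
The built-in iterators follow the same convention; a quick sketch:

lst = [1, 2, 3]
it = iter(lst)
print(iter(lst) is lst)  # False: a list is iterable but is not its own iterator
print(iter(it) is it)    # True: the list_iterator returns itself, so iter(iter(x)) is iter(x)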

Something equivalent to Itertest2 could be written using a separate iterator class.

class Itertest3:
    def __init__(self):
        self.data = list(range(100))

    def __iter__(self):
        return Itertest3Iterator(self.data)


class Itertest3Iterator:
    def __init__(self, data):
        self.state = enumerate(data)

    def __iter__(self):
        return self

    def __next__(self):
        print("in __inter__()")
        i, dp = next(self.state)  # Let StopIteration exception propagate
        print("idx:", i)
        return dp

Compare this to Itertest1, where the instance of Itertest1 itself carried the state of the iteration around in it. Each call to Itertest1.__iter__ returned the same object (the instance of Itertest1), so two loops could not iterate over the data independently.
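
A quick sketch of the difference, assuming the classes defined above (the nested comprehension is noisy because of the print calls in __next__):

t1 = Itertest1()
print(iter(t1) is iter(t1))  # True: both calls return the same instance, so state is shared

t3 = Itertest3()
print(iter(t3) is iter(t3))  # False: each call builds a fresh Itertest3Iterator
pairs = [(a, b) for a in t3 for b in t3]  # nested loops work; each gets its own iterator
print(len(pairs))                         # 10000, i.e. 100 * 100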

Notice I put print("in __iter__()") in __next__, not __iter__. As you observed, nothing in a generator function actually executes until the first call to __next__. The generator function itself only creates a generator; it does not actually start executing the code in it.
