
Implement `__iter__()` and `__next__()` in different

I'm reading a book on Python which illustrates how to implement the iterator protocol.

class Fibbs:
    def __init__(self):
        self.a = 0
        self.b = 1
    def __next__(self):
        self.a, self.b = self.b, self.a + self.b
        return self.a
    def __iter__(self):
        return self

Here, self is both the iterable and the iterator, I believe? However, the paragraph below says:

Note that the iterator implements the __iter__ method, which will, in fact, return the iterator itself. In many cases, you would put the __iter__ method in another object, which you would use in the for loop. That would then return your iterator. It is recommended that iterators implement an __iter__ method of their own in addition (returning self, just as I did here), so they themselves can be used directly in for loops.

Does this mean you can put __iter__() and __next__() in two different objects? Can it be done for objects belonging to different classes? Can it only be done for objects belonging to different classes? It might be a rather bizarre way of implementing the iterator protocol, but I just want to see how, provided it can actually be implemented like that.

How you make iterators and iterables

There are two ways to do this:

  1. Implement __iter__ to return self and nothing else, and implement __next__ on the same class. You've written an iterator.
  2. Implement __iter__ to return some other object that follows the rules of #1 (a cheap way to do this is to write it as a generator function so you don't have to hand-implement the other class). Don't implement __next__. You've written an iterable that is not an iterator.

For correctly implemented versions of each protocol, the way you tell them apart is the __iter__ method. If the body is just return self (maybe with a logging statement or something, but no other side-effects), then either it's an iterator, or it was written incorrectly. If the body is anything else, then either it's a non-iterator iterable, or it was written incorrectly. Anything else violates the requirements of the protocols.

In case #2, the other object would be of another class by definition (because you either have an idempotent __iter__ and implement __next__, or you only have __iter__, without __next__, which produces a new iterator).
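As a rough illustration of the two cases, the sketch below defines one class of each kind and checks them against the abstract base classes in collections.abc, which recognize any class that defines the relevant methods. Countdown and CountdownIterator are hypothetical names made up for this example.

```python
from collections.abc import Iterable, Iterator


class Countdown:  # hypothetical non-iterator iterable (case #2)
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Hands out a fresh iterator on every call; no __next__ here
        return CountdownIterator(self.start)


class CountdownIterator:  # hypothetical iterator (case #1)
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self  # Nothing but return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


countdown = Countdown(3)
iterator = iter(countdown)

checks = {
    "iterable is Iterable": isinstance(countdown, Iterable),  # True
    "iterable is Iterator": isinstance(countdown, Iterator),  # False
    "iterator is Iterable": isinstance(iterator, Iterable),   # True
    "iterator is Iterator": isinstance(iterator, Iterator),   # True
}
print(checks)
print(list(countdown), list(countdown))  # [3, 2, 1] [3, 2, 1]
```

The iterable can be looped over as many times as you like because each loop gets its own CountdownIterator.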


Why the protocol is designed this way

The reason you need __iter__ even on iterators is to support patterns like:

iterable = MyIterable(...)
iterator = iter(iterable)  # Invokes MyIterable.__iter__
next(iterator, None)       # Throw away first item
for x in iterator:         # for implicitly calls the iterator's __iter__; raises TypeError if it isn't defined
    ...

There are two reasons iterables always return a new iterator, rather than being iterators themselves and resetting their state when __iter__ is invoked. First, it handles the case above: if MyIterable just returned itself and reset iteration, the for loop's implicit call to __iter__ would reset it again and undo the intended skip of the first element. Second, it supports patterns like this:

for x in iterable:
    for y in iterable:  # Operating over product of all elements in iterable
        ...

If __iter__ reset itself to the beginning and only had a single state, this would:

  1. Get the first item and put it in x
  2. Reset, then iterate through the whole of iterable putting each value in y
  3. Try to continue outer loop, discover it's already exhausted, never give any other value to x
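To make the failure concrete, here is a minimal sketch of the design being warned against. ResettingRange is a hypothetical, deliberately broken class whose __iter__ resets shared state and returns self.

```python
class ResettingRange:  # hypothetical and deliberately broken; do not copy this design
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.current = start

    def __iter__(self):
        self.current = self.start  # Side effect: rewinds the single shared state
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


broken = ResettingRange(0, 3)

# Nested iteration: the inner loop rewinds and then exhausts the only state there is
pairs = [(x, y) for x in broken for y in broken]
print(pairs)  # [(0, 0), (0, 1), (0, 2)] instead of all 9 pairs

# Skipping the first item: the for loop's implicit iter() call rewinds again
it = iter(broken)
next(it, None)
rest = [x for x in it]
print(rest)  # [0, 1, 2]; the skip was undone
```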

It's also needed because Python assumes that iter(x) is x is a safe, side-effect free way to test if an iterable is an iterator. If your __iter__ modifies your own state, it's not side-effect free. At worst, for iterables, it should waste a little time making an iterator that is immediately thrown away. For iterators, it should be effectively free (since it just returns itself).
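A small sketch of that test, using only built-in types. The helper name is_iterator is made up for this example, and it assumes the argument is at least iterable (iter() raises TypeError otherwise).

```python
def is_iterator(obj):  # hypothetical helper name
    # An iterator's __iter__ returns the object itself;
    # a non-iterator iterable returns some other, new object.
    return iter(obj) is obj


letters = "abc"                       # str is an iterable, not an iterator
letters_iter = iter(letters)          # a separate iterator object
squares = (n * n for n in range(3))   # generators are iterators

results = (is_iterator(letters), is_iterator(letters_iter), is_iterator(squares))
print(results)  # (False, True, True)

# The check consumed nothing, because __iter__ had no side effects
print(next(letters_iter), next(squares))  # a 0
```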


To answer your questions directly:

Does this mean you can put __iter__() and __next__() in two different objects?

For iterators, you can't (an iterator must have both methods, though __iter__ is trivial). For non-iterator iterables, you must (the iterable must have only __iter__, and return some other iterator object). There is no "can".

Can it be done for objects belonging to different classes?

Yes.

Can it only be done for objects belonging to different classes?

Yes.


Examples

Example of iterable:

class MyRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return MyRangeIterator(self)  # Returns new iterator, as this is a non-iterator iterable

    # Likely to have other methods (because iterables are often collections of
    # some sort and support many other behaviors)
    # Does *not* have __next__, as this is not an iterator

Example of iterator:

class MyRangeIterator:  # Class is often non-public and/or defined inside the iterable as
                        # a nested class; it exists solely to store state for the iterator
    def __init__(self, rangeobj):  # Constructed from iterable; could pass raw values if you preferred
        self.current = rangeobj.start
        self.stop = rangeobj.stop
    def __iter__(self):
        return self             # Returns self, because this is an iterator
    def __next__(self):         # Has __next__ because this is an iterator
        retval = self.current   # Must cache current because we need to modify it before we return
        if retval >= self.stop:
            raise StopIteration # Indicates iterator exhausted
        self.current += 1       # Ensure state updated for next call
        return retval           # Return cached value

    # Unlikely to have other methods; iterators are generally iterated and that's it
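For a quick check of how the pair behaves, the sketch below restates both classes without comments so that it runs on its own, then exercises the nested-loop and skip-first patterns discussed earlier.

```python
class MyRangeIterator:
    def __init__(self, rangeobj):
        self.current = rangeobj.start
        self.stop = rangeobj.stop

    def __iter__(self):
        return self

    def __next__(self):
        retval = self.current
        if retval >= self.stop:
            raise StopIteration
        self.current += 1
        return retval


class MyRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return MyRangeIterator(self)


r = MyRange(0, 3)

# Each loop gets its own iterator, so the full product is produced
pairs = [(x, y) for x in r for y in r]
print(len(pairs))  # 9

# Skipping the first item works because the iterator's __iter__ returns self
it = iter(r)
next(it, None)
rest = list(it)
print(rest)  # [1, 2]

# The iterable itself holds no iteration state, so it can be reused
print(list(r))  # [0, 1, 2]
```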

Example of "easy iterable" where you don't implement your own iterator class, by making __iter__ a generator function:

class MyEasyRange:
    def __init__(self, start, stop): ... # Same as for MyRange

    def __iter__(self):  # Generator function is simpler (and faster)
                         # than writing your own iterator class
        current = self.start  # Can't mutate attributes, because multiple iterators might rely on this one iterable
        while current < self.stop:
            yield current     # Produces value and freezes generator until iteration resumes
            current += 1
        # reaching the end of the function acts as implicit StopIteration for a generator
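As a sketch of what the generator version gives you, the snippet below fills in __init__ the same way as MyRange (an assumption made so that it runs on its own) and shows that each call to iter() yields an independent generator object, which is itself an iterator.

```python
import types


class MyEasyRange:
    def __init__(self, start, stop):  # Assumed to match MyRange.__init__
        self.start = start
        self.stop = stop

    def __iter__(self):
        current = self.start  # Local variable, so each generator has its own position
        while current < self.stop:
            yield current
            current += 1


easy = MyEasyRange(0, 3)
first = iter(easy)
second = iter(easy)

a = next(first)   # 0
b = next(first)   # 1
c = next(second)  # 0, because second is unaffected by first

print(a, b, c)                                 # 0 1 0
print(isinstance(first, types.GeneratorType))  # True
print(iter(first) is first)                    # True: the generator is an iterator
print(list(easy))                              # [0, 1, 2]
```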


 