I am currently reading the official documentation of Python 3.5.
It states that range() is iterable, and that list() and for are iterators. [section 4.3]
However, here it states that zip() makes an iterator.
My question is: when we use this instruction:
list(zip(list1, list2))
are we using an iterator (list()) to iterate through another iterator?
The documentation is creating some confusion here, by re-using the term 'iterator'.
There are three components to the iterator protocol:
1. Iterables: things you can potentially iterate over to get their elements, one by one.
2. Iterators: things that do the iteration. Every time you want to step through all items of an iterable, you need one of these to keep track of where you are in the process. These are not re-usable; once you reach the end, that's it. For most iterables, you can create multiple independent iterators, each tracking its position separately.
3. Consumers of iterators: things that want to do something with the items.
A for loop is an example of the last of these, #3. A for loop uses the iter() function to produce an iterator (#2 above) for whatever you want to loop over, so that "whatever" must be an iterable (#1 above).
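As a rough sketch of what a for loop does behind the scenes, the following spells out the iter() and next() calls by hand. The helper name manual_for and its action parameter are made up for illustration; this approximates the loop's behaviour and is not how the interpreter is actually implemented.

```python
def manual_for(iterable, action):
    """Approximate `for item in iterable: action(item)` by hand."""
    iterator = iter(iterable)      # get an iterator (#2) from the iterable (#1)
    while True:
        try:
            item = next(iterator)  # ask the iterator for the next item
        except StopIteration:      # the iterator is exhausted, so stop looping
            break
        action(item)               # the loop body, i.e. the consumer (#3)

collected = []
manual_for(range(3), collected.append)
print(collected)  # [0, 1, 2]
```

Passing something that is not iterable (for example an int) makes iter() raise TypeError, which is the same error a real for loop would report.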
range() is an example of #1; it is an iterable object. You can iterate over it multiple times, independently:
>>> r = range(5)
>>> r_iter_1 = iter(r)
>>> next(r_iter_1)
0
>>> next(r_iter_1)
1
>>> r_iter_2 = iter(r)
>>> next(r_iter_2)
0
>>> next(r_iter_1)
2
Here r_iter_1 and r_iter_2 are two separate iterators, and each time you ask one of them for the next item, it answers based on its own internal bookkeeping.
list() is an example of both an iterable (#1) and an iteration consumer (#3). If you pass another iterable (#1) to the list() call, a list object is produced containing all elements from that iterable. But list objects themselves are also iterables.
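A small sketch of both roles, using illustrative variable names that are not from the documentation: the list() call consumes a range, and the resulting list object can then hand out independent iterators of its own.

```python
numbers = list(range(3))   # list() acts as a consumer (#3) of the range iterable

# The list object is itself an iterable (#1): each iter() call
# creates a fresh iterator (#2) with its own position.
first = iter(numbers)
second = iter(numbers)

next(first)                # advance only the first iterator
from_first = next(first)
from_second = next(second)

print(numbers)      # [0, 1, 2]
print(from_first)   # 1
print(from_second)  # 0
```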
zip(), in Python 3, takes in multiple iterables (#1), and is itself an iterator (#2). zip() stores a new iterator (#2) for each of the iterables you gave it. Each time you ask zip() for the next element, zip() builds a new tuple with the next elements from each of the contained iterables:
>>> lst1, lst2 = ['foo', 'bar'], [42, 81]
>>> zipit = zip(lst1, lst2)
>>> next(zipit)
('foo', 42)
>>> next(zipit)
('bar', 81)
So in the end, list(zip(list1, list2)) uses both list1 and list2 as iterables (#1); zip() consumes those (#3) while it is itself being consumed by the outer list() call.
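To make that chain concrete, here is a simplified pure-Python stand-in for zip(), written as a generator. The name simple_zip is hypothetical and the real zip() is implemented in C, so treat this only as a sketch of the behaviour described above.

```python
def simple_zip(*iterables):
    """Simplified stand-in for zip(); a sketch, not the real implementation."""
    # Consumer role (#3): create one iterator (#2) per input iterable (#1).
    iterators = [iter(it) for it in iterables]
    while iterators:
        row = []
        for iterator in iterators:
            try:
                row.append(next(iterator))
            except StopIteration:
                return          # stop as soon as any input is exhausted
        yield tuple(row)        # iterator role (#2): hand out one tuple per step

lst1, lst2 = ['foo', 'bar'], [42, 81]
zipit = simple_zip(lst1, lst2)
first_pair = next(zipit)        # the outer consumer pulls one item
remaining = list(zipit)         # list() consumes whatever is left

print(first_pair)  # ('foo', 42)
print(remaining)   # [('bar', 81)]
```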
The documentation is badly worded. Here's the section you're referring to:
We say such an object is iterable, that is, suitable as a target for functions and constructs that expect something from which they can obtain successive items until the supply is exhausted. We have seen that the for statement is such an iterator. The function list() is another; it creates lists from iterables:
In this paragraph, iterator does not refer to a Python iterator object, but the general idea of "something which iterates over something". In particular, the for statement cannot be an iterator object because it isn't an object at all; it's a language construct.
To answer your specific question:
... when we use this instruction:
list(zip(list1, list2))
are we using an iterator (list()) to iterate through another iterator?
No, list() is not an iterator. It's the constructor for the list type. It can accept any iterable (including an iterator) as an argument, and uses that iterable to construct a list.
zip() is an iterator function, that is, a function which returns an iterator. In your example, the iterator it returns is passed to list(), which constructs a list object from it.
A simple way to tell whether an object is an iterator is to call next() on it, and see what happens:
>>> list1 = [1, 2, 3]
>>> list2 = [4, 5, 6]
>>> zipped = zip(list1, list2)
>>> zipped
<zip object at 0x7f27d9899688>
>>> next(zipped)
(1, 4)
In this case, the next element of zipped is returned.
>>> list3 = list(zipped)
>>> list3
[(2, 5), (3, 6)]
Notice that only the last two elements of the iterator are found in list3, because we already consumed the first one with next().
>>> next(list3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
This doesn't work, because lists are not iterators.
>>> next(zipped)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
This time, although zipped is an iterator, calling next() on it raises StopIteration because it was already exhausted when list3 was constructed.
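If you would rather not consume an element just to find out, one alternative is to check against the abstract base classes in collections.abc. The helper describe below is a made-up name for this sketch; the isinstance checks only report whether the object provides the relevant methods.

```python
from collections.abc import Iterable, Iterator

def describe(obj):
    """Return (is_iterable, is_iterator) for obj without consuming it."""
    return isinstance(obj, Iterable), isinstance(obj, Iterator)

list1, list2 = [1, 2, 3], [4, 5, 6]
zipped = zip(list1, list2)

print(describe(list1))     # (True, False): iterable, but not an iterator
print(describe(range(3)))  # (True, False)
print(describe(zipped))    # (True, True): an iterator is also iterable

# An iterator returns itself from iter(); a list returns a new iterator.
print(iter(zipped) is zipped)  # True
print(iter(list1) is list1)    # False
```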