
Why does `zip` eat one more element from the first iterator?

I'm studying Item 76 in Effective Python (2nd Ed.), and I ran into a case that I don't understand. Specifically, I don't understand why zip consumes one additional element of its first argument. Consider the following code:

l1 = []
l2 = [1]
l1_it = iter(l1)
l2_it = iter(l2)
test_it = zip(l2_it, l1_it)
_ = list(test_it)

try:
    next(l2_it)
except StopIteration:
    print('This should not happen')

This actually prints This should not happen, and I find this very surprising. I would expect zip to leave its first argument in a state where there is still one element to be retrieved. And in fact, if I use zip(l1_it, l2_it) (that is, the shortest list is first), then I can indeed call next(l2_it) without triggering an exception.

Is this expected?

zip stops as soon as its shortest input is exhausted, so it yields only as many tuples as the shortest iterable has items. Since l1 has no items, the item in l2 never appears in the output.

However, with an iterator, Python doesn't know how many items are available, so its only recourse is to try to fetch an item. If there is no item, it gets a StopIteration exception. If there is an item, it has already been consumed from the iterator by then. zip asks its arguments in order, so it fetches the item from l2_it first and only afterwards discovers that l1_it is empty; the item it already fetched is discarded.
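The fetching order described above can be sketched in pure Python. This is only a rough illustration of the behaviour, not CPython's actual implementation, and the name sketch_zip is made up for this example:

```python
def sketch_zip(*iterables):
    """Rough pure-Python equivalent of zip(), for illustration only."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                # Anything already pulled into `row` from earlier
                # iterators is silently dropped here.
                return
        yield tuple(row)


# Longer iterator first: its only item is fetched, then thrown away.
l1_it = iter([])
l2_it = iter([1])
print(list(sketch_zip(l2_it, l1_it)))  # []
print(next(l2_it, 'exhausted'))        # exhausted

# Shorter iterator first: the loop stops before touching l2_it.
l1_it = iter([])
l2_it = iter([1])
print(list(sketch_zip(l1_it, l2_it)))  # []
print(next(l2_it, 'exhausted'))        # 1
```

The two-argument form next(iterator, default) is used here only to avoid a try/except around the final check.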

Perhaps you were expecting the behaviour of zip_longest? https://docs.python.org/3/library/itertools.html?highlight=itertools#itertools.zip_longest
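For comparison, here is a small sketch of what itertools.zip_longest does with the same inputs as in the question. Note that it keeps going until the longest input is exhausted and pads the missing values, so it does not leave anything behind in l2_it either; it just doesn't discard the item:

```python
from itertools import zip_longest

l1_it = iter([])
l2_it = iter([1])

# Missing values are padded with None by default.
pairs = list(zip_longest(l2_it, l1_it))
print(pairs)  # [(1, None)]

# A different padding value can be given with fillvalue.
pairs_filled = list(zip_longest([1, 2], ['a'], fillvalue='?'))
print(pairs_filled)  # [(1, 'a'), (2, '?')]
```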

To illustrate the iter() behaviour:

>>> x = [1]
>>> x_iter = iter(x)

# An iterator has no length, so the only way to check whether items
# are available is to try to fetch one.
>>> len(x_iter)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'list_iterator' has no len()

# First item can be fetched
>>> next(x_iter)
1

# There is no second item so Python raises a StopIteration
>>> next(x_iter)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

I had exactly this problem. After reading the question and the answers, I understood why it happens and simply changed the order of the zipped iterables: the one that ends first (the shorter iterable) goes first, and the one that should not have an (at first sight) unexpected additional element "eaten" goes second. That solved the problem.

So, in my opinion, the conclusion is: always pass the shorter iterable as the first argument to zip and the longer one as the second, to avoid "surprises".

For example, using something similar to your example:

>>> l1 = [1]                     # the shorter one goes on the left
>>> l2 = [2, 3]                  # the longer one goes on the right
>>> l1_it = iter(l1)
>>> l2_it = iter(l2)
>>> # zip hits StopIteration on l1_it first and leaves l2_it alone after that
>>> test_it = zip(l1_it, l2_it)
>>> print(list(test_it))         # exhaust test_it and print the result
[(1, 2)]

>>> next(l2_it)                  # l2_it still has its next item, as expected
3
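To see both argument orders side by side, something like the following could be used. The helper leftover_after_zip is a hypothetical name introduced just for this comparison; it zips two lists through fresh iterators and reports what remains in each:

```python
def leftover_after_zip(first, second):
    """Zip two lists through fresh iterators and report what is left in each."""
    first_it, second_it = iter(first), iter(second)
    zipped = list(zip(first_it, second_it))
    return zipped, list(first_it), list(second_it)


short, long_ = [1], [2, 3]

# Shorter first: the 3 is still available in the longer iterator.
print(leftover_after_zip(short, long_))  # ([(1, 2)], [], [3])

# Longer first: zip fetched the 3 before noticing the other side was done.
print(leftover_after_zip(long_, short))  # ([(2, 1)], [], [])
```

This only helps when you know in advance which input is shorter.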
