How does Python manage a "for" loop internally?

Question

I'm trying to learn Python, and I started playing with some code:

a = [3,4,5,6,7]
for b in a:
    print(a)
    a.pop(0)

The output is:

[3, 4, 5, 6, 7]
[4, 5, 6, 7]
[5, 6, 7]

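I know it's not good practice to change a data structure while looping over it, but I want to understand how Python manages the iterator in this case.

The main question is: how does it know that the loop has finished if I change the state of a?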

Answer 1

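The reason you shouldn't do this is precisely so that you don't have to rely on how the iteration is implemented.

But back to the question. Lists in Python are array lists. They represent one contiguous block of allocated memory (holding references to the elements), as opposed to linked lists, in which each element is allocated independently. So Python's lists, like arrays in C, are optimized for random access. In other words, the most efficient way to get from element n to element n+1 is to access element n+1 directly (by calling mylist.__getitem__(n+1) or mylist[n+1]).

So the implementation of __next__ for the list iterator (the method called on each iteration) is just what you would expect: the index of the current element is first set to 0 and then increased after each iteration.

In your code, if you also print b, you will see this happening: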

a = [3,4,5,6,7]
for b in a:
    print(a, b)
    a.pop(0)

Result:

[3, 4, 5, 6, 7] 3
[4, 5, 6, 7] 5
[5, 6, 7] 7

Because:

At iteration 0, a[0] == 3.
At iteration 1, a[1] == 5.
At iteration 2, a[2] == 7.
At iteration 3, the loop ends (len(a) < 3: only 2 elements are left, so index 3 is past the end).
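The index-based behaviour described above can be imitated in pure Python. This is only a sketch: ListIteratorSketch is a hypothetical class written for illustration, not the actual CPython implementation (which is written in C). It keeps a reference to the live list and re-reads its length on every step.

```python
class ListIteratorSketch:
    """Pure-Python approximation of a list iterator (illustrative only)."""

    def __init__(self, seq):
        self.seq = seq    # reference to the live list, not a copy
        self.index = 0    # position of the next item to return

    def __iter__(self):
        return self

    def __next__(self):
        # The length is re-read on every call, so shrinking the list
        # brings the end of the iteration closer.
        if self.index < len(self.seq):
            item = self.seq[self.index]
            self.index += 1
            return item
        raise StopIteration


a = [3, 4, 5, 6, 7]
seen = []
for b in ListIteratorSketch(a):
    seen.append(b)
    a.pop(0)

print(seen)  # [3, 5, 7]
print(a)     # [6, 7]
```

The sketch visits the same elements (3, 5, 7) as the built-in iterator does in the question's loop.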

Answer 2

kjaquier and Felix have already talked about the iterator protocol, and we can see it in action in your case:

>>> L = [1, 2, 3]
>>> iterator = iter(L)
>>> iterator
<list_iterator object at 0x101231f28>
>>> next(iterator)
1
>>> L.pop()
3
>>> L
[1, 2]
>>> next(iterator)
2
>>> next(iterator)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration

From this we can infer that list_iterator.__next__ has code that behaves something like:

if self.i < len(self.list):
    item = self.list[self.i]
    self.i += 1  # advance to the next position
    return item
raise StopIteration

It does not naively fetch the item. That would raise an IndexError, which would bubble up to the top:

class FakeList(object):
    def __iter__(self):
        return self

    def __next__(self):
        raise IndexError

for i in FakeList():  # Raises `IndexError` immediately with a traceback and all
    print(i)

Indeed, looking at listiter_next in the CPython source (thanks Brian Rodriguez):

if (it->it_index < PyList_GET_SIZE(seq)) {
    item = PyList_GET_ITEM(seq, it->it_index);
    ++it->it_index;
    Py_INCREF(item);
    return item;
}

Py_DECREF(seq);
it->it_seq = NULL;
return NULL;

Although I don't know how return NULL; eventually translates into a StopIteration.
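One observable consequence of the it->it_seq = NULL line can be checked from Python. The snippet below is a small sketch of CPython's behaviour as the C code above suggests it: once the iterator has reported exhaustion, it has dropped its reference to the list, so growing the list afterwards does not revive it. The variable names and the "exhausted" sentinel string are chosen here purely for illustration.

```python
L = [1, 2, 3]
iterator = iter(L)

# Consume the iterator completely; the last internal step hits the
# "index >= size" branch and the iterator lets go of the list.
items = list(iterator)

# Grow the list after the iterator has already finished.
L.append(4)

# next() with a default returns the default instead of raising StopIteration.
leftover = next(iterator, "exhausted")

print(items)     # [1, 2, 3]
print(leftover)  # exhausted
```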

Answer 3

We can easily see the sequence of events by using a little helper function foo:

def foo():
    for i in l:
        l.pop()

and dis.dis(foo) to see the Python bytecode that is generated. Snipping away the not-so-relevant opcodes, your loop does the following:

          2 LOAD_GLOBAL              0 (l)
          4 GET_ITER
    >>    6 FOR_ITER                12 (to 20)
          8 STORE_FAST               0 (i)

         10 LOAD_GLOBAL              0 (l)
         12 LOAD_ATTR                1 (pop)
         14 CALL_FUNCTION            0
         16 POP_TOP
         18 JUMP_ABSOLUTE            6

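That is, it gets the iter for the given object (iter(l), a specialized iterator object for lists) and loops until FOR_ITER signals that it's time to stop. Adding the juicy parts, here is what FOR_ITER does: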

PyObject *next = (*iter->ob_type->tp_iternext)(iter);

which essentially is:

list_iterator.__next__()

This (finally*) goes through to listiter_next, which performs the index check using the original sequence l:

if (it->it_index < PyList_GET_SIZE(seq))

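When this check fails, NULL is returned, which signals that the iteration has finished. In the meantime, a StopIteration exception, if one has been set, is silently suppressed in the FOR_ITER opcode code: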

if (!PyErr_ExceptionMatches(PyExc_StopIteration))
    goto error;
else if (tstate->c_tracefunc != NULL)
    call_exc_trace(tstate->c_tracefunc, tstate->c_traceobj, tstate, f);
PyErr_Clear();  /* My comment: Suppress it! */

So whether you change the list or not, the check in listiter_next will ultimately fail and do the same thing.

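* For anyone wondering, listiter_next is a descriptor, so there's a little function wrapping it. In this specific case, that function is wrap_next, which makes sure to set PyExc_StopIteration as an exception when listiter_next returns NULL.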
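To reproduce the disassembly on your own interpreter, something like the following sketch can be used. It assumes a global list l (the answer's helper relies on one without defining it) and only checks for the two opcodes discussed above; exact offsets, arguments, and the surrounding opcodes differ between CPython versions.

```python
import dis

l = [1, 2, 3]  # assumed global list for the helper to loop over


def foo():
    for i in l:
        l.pop()


# Collect the opcode names instead of printing the full listing,
# since the listing itself is version-dependent.
opnames = [ins.opname for ins in dis.get_instructions(foo)]
print("GET_ITER" in opnames, "FOR_ITER" in opnames)

foo()
print(l)  # [1]
```

Calling dis.dis(foo) instead prints the full human-readable listing.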

Answer 4

AFAIK, the for loop uses the iterator protocol. You can manually create and use the iterator as follows:

In [16]: a = [3,4,5,6,7]
    ...: it = iter(a)
    ...: while(True):
    ...:     b = next(it)
    ...:     print(b)
    ...:     print(a)
    ...:     a.pop(0)
    ...:
3
[3, 4, 5, 6, 7]
5
[4, 5, 6, 7]
7
[5, 6, 7]
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-16-116cdcc742c1> in <module>()
      2 it = iter(a)
      3 while(True):
----> 4     b = next(it)
      5     print(b)
      6     print(a)

The for loop stops when the iterator is exhausted (raises StopIteration).
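Putting the pieces together, a for loop behaves roughly like the while loop below, with the StopIteration caught instead of propagating. This is a sketch of the semantics, not how the interpreter is implemented; run_for_loop and body are hypothetical names introduced only for this example.

```python
def run_for_loop(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    it = iter(iterable)
    while True:
        try:
            item = next(it)
        except StopIteration:
            break  # the iterator is exhausted: leave the loop quietly
        # The body runs outside the try block, so a StopIteration
        # raised by the body itself would not be swallowed here.
        body(item)


a = [3, 4, 5, 6, 7]
seen = []


def body(b):
    seen.append(b)
    a.pop(0)


run_for_loop(a, body)
print(seen)  # [3, 5, 7]
```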

4 solutions

Solution 1
14 2017-04-04 12:15:42

Solution 2
9 Accepted 2017-04-04 12:30:15

Solution 3
2 2017-04-04 17:10:55

Solution 4
1 2017-04-04 12:19:40
